Abstract. We consider a generalization of the secretary problem where 
contracts are temporary, and for a fixed duration $\gamma$. This models online 
hiring of temporary employees, or online auctions for re-usable resources. 
The problem is related to the question of finding a large independent set 
in a random unit interval graph. 


1 Introduction 

This paper deals with a variant of the secretary model, where contracts are
temporary. For example, employees are hired for short-term contracts, or re-usable
resources are rented out repeatedly. If an item is chosen, it “exists” for a fixed length
of time and then disappears.

This problem is motivated by web sites such as Airbnb and oDesk. Airbnb
offers short term rentals in competition with classic hotels. A homeowner posts a 
rental price and customers either accept it or not. oDesk is a venture capitalizing 
on freelance employees. A firm seeking short term freelance employees offers a 
salary and performs interviews of such employees before choosing one of them. 

We consider an online setting where items have values determined by an 
adversary, (“no information” as in the standard model m), combined with 
stochastic arrival times that come from a prior known distribution (in contrast 
to the random permutation assumption and as done in mm)- Unlike much 
of the previous work on online auctions with stochastic arrival/departure timing 
([18]), we do not consider the issue of incentive compatibility with respect to
timing, and assume that arrival time cannot be misrepresented. 

The temp secretary problem can be viewed 

1. As a problem related to hiring temporary workers of varying quality, subject
to workplace capacity constraints. There is some known prior $F(x) = \int_0^x f(t)\,dt$
on the arrival times of job seekers, some maximal capacity, $d$, on
the number of such workers that can be employed simultaneously, and a
bound $k$ on the total number that can be hired over time. If hired, workers
cannot be fired before their contract is up.

2. Alternately, one can view the temp secretary problem as dealing with social
welfare maximization in the context of rentals. Customers arrive according
to some distribution. A firm with capacity $d$ can rent out up to $d$ boats
simultaneously, possibly constrained to no more than $k$ rentals overall. The
firm publishes a rental price, which may change over time after a customer
is serviced. A customer will choose to rent if her value for the service is
at least the current posted price. Such a mechanism is inherently dominant
strategy truthful, with the caveat that we make the common assumption
that customers reveal their true values in any case.

We give two algorithms, both of which are quite simple and offer posted 
prices for rental that vary over time. Assuming that the time of arrival cannot 
be manipulated, this means that our algorithms are dominant strategy incentive 
compatible. 

For rental duration $\gamma$, capacity $d = 1$, no budget restrictions, and arrival
times from an arbitrary prior, the time-slice algorithm gives a $1/(2e)$ competitive
ratio. For arbitrary $d$ the competitive ratio of the time-slice algorithm is at least
$(1/2) \cdot (1 - 5/\sqrt{d})$. This can be generalized to more complex settings, see Table 2.
The time-slice algorithm divides time into slices of length $\gamma$. It randomly decides
whether to work on even or odd slices. Within each slice it uses an algorithm for some
other variant of the secretary problem (e.g., [26], [2], [24]) except that it keeps track of the
cumulative distribution function rather than the number of secretaries.

The more technically challenging Charter algorithm is strongly motivated
by the $k$-secretary algorithm of [24]. For capacity $d$, employment period $\gamma$, and
budget $d \le k \le d/\gamma$ (the only relevant values), the Charter algorithm does the
following:

— Recursively run the algorithm with parameters $\gamma$, $\lfloor k/2 \rfloor$ on all bids that
arrive during the period $[0, 1/2)$.

— Take the bid of rank $\lfloor k/2 \rfloor$ that appeared during the period $[0, 1/2)$, if such
rank exists, and set a threshold $T$ to be its value. If no such rank exists set
the threshold $T$ to be zero.

— Greedily accept all items that appear during the period $[1/2, 1)$ that have
value at least $T$, subject to not exceeding capacity ($d$) or budget ($k$)
constraints.


For $d = 1$ the competitive ratio of the Charter algorithm is at least

$$\frac{1}{1+\gamma k}\left(1 - \frac{5}{\sqrt{k}} - 7.4\sqrt{\gamma \ln(1/\gamma)}\right).$$

One special case of interest is $k = 1/\gamma$ (no budget restriction), in which case
the expression above is at least $\frac{1}{2}\left(1 - 12.4\sqrt{\gamma \ln(1/\gamma)}\right)$. We also show an upper
bound of $1/2 + \gamma/2$ for $\gamma > 0$. As $\gamma$ approaches zero the two bounds converge to
$1/2$. Another case of interest is when $k$ is fixed and $\gamma$ approaches zero, in which case
this becomes the guarantee given by Kleinberg’s $k$-secretary algorithm.

For arbitrary $d$ the competitive ratio of the Charter algorithm is at least

$$1 - \Theta(\ldots) - \Theta(\gamma \log(1/\gamma)).$$

We remark that neither the time slice algorithm nor the Charter algorithm 
requires prior knowledge of n, the number of items due to arrive. 

At the core of the analysis of the Charter algorithm we prove a bound on
the expected size of the maximum independent set of a random unit interval
graph (see Table 1). In this random graph model we draw $n$ intervals, each
of length $\gamma$, by drawing their left endpoints uniformly in the interval $[0,1)$. We
prove that the expected size of a maximum independent set in such a graph is
about $n/(1 + n\gamma)$. We say that a set of length $\gamma$ segments that do not overlap is
$\gamma$-independent. Similarly, a capacity $d$ $\gamma$-independent set allows no more than $d$
segments overlapping at any point.

Note that if $\gamma = 1/n$ then this expected size is about $n/2$. This is intuitively
the right bound as each interval in the maximum independent set rules out on 
average one other interval from being in the maximum independent set. 

We show that a random unit interval graph with $n$ vertices has a capacity
$d$ $\gamma$-independent subset of expected size at least $\min(n, d/\gamma)(1 - \Theta(\sqrt{\ln d}/\sqrt{d}))$.
We also show that when $n = d/\gamma$ the expected size of the maximum capacity $d$
$\gamma$-independent subset is no more than $n(1 - \Theta(1/\sqrt{d}))$. These results may be of
independent interest.

Related work: Worst case competitive analysis of interval scheduling has a long 
history, e.g., [aoEH]. This is the problem of choosing a set of non-overlapping 
intervals with various target functions, typically, the sum of values. 

m introduce the question of auctions for reusable goods. They consider a 
worst case mechanism design setting. Their main goal is addressing the issue of 
time incentive compatibility, for some restricted set of misrepresentations. 

The secretary problem is arguably due to Johannes Kepler (1571-1630), and 
has a great many variants, a survey by m contains some 70 references. The 
“permutation” model is that items arrive in some random order, all $n!$ permutations
equally likely. Maximizing the probability that the best item is chosen,
when the items appear in random order, only comparisons can be made, and 
the number of items is known in advance, was solved by m and by A 
great many other variants are described in f [15lll) l. differing in the number of 
items to be chosen, the target function to be maximized, taking discounting into 
account, etc. 

An alternative to the random permutation model is the stochastic arrival 
model, introduced by Karlin [21] in a “full information” (known distribution
on values) setting. Bruss [7] subsequently studied the stochastic arrival model 
in a no-information model (nothing is known about the distribution of values). 
Recently, [13] made use of the stochastic arrival model as a tool for the analysis 
of algorithms in the permutation model. 

Much of the recent interest in the secretary problem is due to its connection
to incentive compatible auctions and posted prices [18,24,2,3,1,10].

Most directly relevant to this paper is the $k$-secretary algorithm by R. Kleinberg
[24]. Constrained to picking no more than $k$ secretaries, the total value of
the secretaries picked by this algorithm is at least a $(1 - 5/\sqrt{k})$ fraction of the value of the
best $k$ secretaries.

Babaioff et al. [5] introduced the knapsack secretary problem in which every
secretary has some weight and a value, and one seeks to maximize the sum
of values subject to an upper bound on the total weight. They give a $1/(10e)$
competitive algorithm for this problem. (Note that if weights are one then this
becomes the $k$-secretary problem.) The Matroid secretary problem, introduced
by Babaioff et al. [1], constrains the set of secretaries picked to be an independent
set in some underlying Matroid. Subsequent results for arbitrary Matroids are

given in mm- 

Another generalization of the secretary problem is the online maximum bipartite
matching problem. See [25,22]. Secretary models with full information or
partial information (priors on values) appear in and [29] . This was in the con¬ 
text of submodular procurement auctions ([ 3 ) and budget feasible procurement 
(US])- Other papers considering a stochastic setting include [23117] . 

In our analysis, we give a detailed and quite technical lower bound on the size 
of the maximum independent set in a random unit interval graph (produced by 
the greedy algorithm). Independent sets in other random interval graph models 
were previously studied in [201916] . 

2 Formal Statement of Problems Considered 

Each item $x$ has a value $v(x)$. We assume that for all $x \ne y$, $v(x) \ne v(y)$ by
consistent tie breaking, and we say that $x > y$ iff $v(x) > v(y)$. Given a set of
items $X$, define $v(X) = \sum_{x \in X} v(x)$ and $T_k(X) = \max_{T \subseteq X, |T| \le k} v(T)$.

Given a set $X$ and a probability density function $f$ defined on $[0,1)$, let
$\theta_f : X \to [0,1)$ be a random mapping where $\theta_f(x)$ is drawn independently from
the distribution $f$. The function $\theta_f$ is called a stochastic arrival function, and we
interpret $\theta_f(x)$, $x \in X$, to be the time at which item $x$ arrives. For the special
case in which $f$ is uniform we refer to $\theta_f$ as $\theta$.

In the problems we consider, the items arrive in increasing order of $\theta_f$. If
$\theta_f(x) = \theta_f(y)$ the relative order of arrival of $x$ and $y$ is arbitrary. An online
algorithm may select an item only upon arrival. If an item $x$ was selected, we
say that the online algorithm holds $x$ for $\gamma$ time following $\theta_f(x)$.

An online algorithm $A$ for the temp secretary problem may hold at most one
item at any time and may select at most $k$ items in total. We refer to $k$ as the
budget of $A$. The goal of the algorithm is to maximize the expected total value
of the items that it selects. We denote by $A(X, \theta_f)$ the set of items chosen by
algorithm $A$ on items in $X$ appearing according to stochastic arrival function
$\theta_f$.

The set of the arrival times of the items selected by an algorithm for the
temp secretary problem is said to be $\gamma$-independent. Formally, a set $S \subset [0,1)$ is
said to be $\gamma$-independent if for all $t_1, t_2 \in S$, $t_1 \ne t_2$, we have that $|t_1 - t_2| \ge \gamma$.

Given $\gamma > 0$, a budget $k$, a set $X$ of items, and a mapping $\theta_f : X \to [0,1)$
we define $\mathrm{Opt}(X, \theta_f)$ to be a $\gamma$-independent set $S$, $|S| \le k$, that maximizes the
sum of values.
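
The two definitions above can be made concrete with a short sketch. The code below is illustrative only: the function names `is_gamma_independent` and `opt_value` are hypothetical, items are assumed to be given as `(arrival_time, value)` pairs with non-negative values, and the offline optimum is computed with a standard weighted interval scheduling dynamic program extended with a budget dimension (our choice of technique, not something prescribed by the text).

```python
from bisect import bisect_right


def is_gamma_independent(times, gamma):
    """True if every two distinct arrival times are at least gamma apart."""
    ts = sorted(times)
    return all(b - a >= gamma for a, b in zip(ts, ts[1:]))


def opt_value(items, gamma, k):
    """Value of a best gamma-independent set of at most k items.

    items: list of (arrival_time, value) pairs, values assumed non-negative.
    """
    items = sorted(items)
    times = [t for t, _ in items]
    n = len(items)
    # best[i][j]: best value using only the first i items (by arrival time)
    # and at most j selections.
    best = [[0.0] * (k + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        t, v = items[i - 1]
        # p = number of items arriving no later than t - gamma, i.e. the
        # items that may be selected together with item i.
        p = bisect_right(times, t - gamma)
        for j in range(1, k + 1):
            best[i][j] = max(best[i - 1][j], best[p][j - 1] + v)
    return best[n][k]
```

For instance, with arrivals at times 0, 0.125 and 0.375 and $\gamma = 0.25$, the items at 0.125 and 0.375 can be held one after the other, while the items at 0 and 0.125 cannot.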

Given rental period $\gamma > 0$, distribution $f$, and budget $k$, the competitive
ratio of an online algorithm $A$ is defined to be

$$\inf_X \frac{E_{\theta_f : X \to [0,1)}[v(A(X, \theta_f))]}{E_{\theta_f : X \to [0,1)}[v(\mathrm{Opt}(X, \theta_f))]}. \qquad (1)$$

The competitive ratio of the temp secretary problem is the supremum over all 
algorithms A of the competitive ratio of A. 

Note that when $\gamma \to 0$, the temp secretary problem reduces to Kleinberg’s
$k$-secretary problem.

We extend the $\gamma$-temp secretary problem by allowing the algorithm to hold
at most d items at any time. Another extension we consider is the knapsack temp 
secretary problem where each item has a weight and we require the set held by 
the algorithm at any time to be of total weight at most W. Also, we define the 
Matroid temp secretary problem where one restricts the set of items held by the 
algorithm at any time to be an independent set in some Matroid M. 

More generally, one can define a temp secretary problem with respect to some 
arbitrary predicate P that holds on the set of items held by an online algorithm 
at all times t. This framework includes all of the variants above. The optimal 
solution with respect to P is also well defined. 

3 The time-slice Algorithm. 

In this section we describe a simple time slicing technique. This gives a reduction
from temp secretary problems, with arbitrary known prior distribution on
arrival times, to the “usual” continuous setting where secretaries arrive over
time, do not depart if hired, and the distribution on arrival times is uniform.
The reduction is valid for many variants of the temp secretary problem, including
the Matroid secretary problem and the knapsack secretary problem. We
remark that although the Matroid and knapsack algorithms are stated in the
random permutation model, they can be replaced with analogous algorithms in
the continuous time model and can therefore be used in our context.

We demonstrate this technique by applying it to the classical secretary problem
(hire the best secretary). We obtain an algorithm, which we call $\mathrm{Slice}_\gamma$, for
the temp secretary problem with arbitrary prior distribution on arrival times
that is $O(1)$ competitive.

Consider the $1/(2\gamma)$ time intervals (i.e. slices) $I_j = [2\gamma j, 2\gamma(j+1))$, $0 \le
j \le 1/(2\gamma) - 1$. We split every such interval into two, $I_j^\ell = [2\gamma j, 2\gamma j + \gamma)$,
$I_j^r = [2\gamma j + \gamma, 2\gamma(j+1))$.^1

Initially, we flip a fair coin and decide, each with probability 1/2, to pick points
only from the left halves ($I_j^\ell$’s) or only from the right halves ($I_j^r$’s). In each such
interval we pick at most one item by running the following modification of the
continuous time secretary algorithm.

^1 For simplicity we assume that $1/(2\gamma)$ is an integer.

The continuous time secretary algorithm observes the items arriving
before time $1/e$, sets the largest value of an observed item as a threshold, and
then chooses the first item (that arrives following time $1/e$) of value greater than
the threshold. The modified continuous time secretary algorithm observes items
as long as the cumulative distribution function of the current time is less than
$1/e$, then it sets the largest value of an observed item as a threshold,
and chooses the next item of value larger than the threshold.

It is clear that any two points picked by this algorithm have arrival times
separated by at least $\gamma$.
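
The slicing procedure can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a definitive implementation: the name `slice_gamma` and its parameters are hypothetical, arrivals are `(time, value)` pairs, `cdf` is the known prior on arrival times (uniform by default), and "the cumulative distribution function of the current time" is interpreted as the CDF conditioned on the current half-slice, which is our reading of the description above.

```python
import math
import random


def slice_gamma(arrivals, gamma, cdf=lambda t: t, use_left=None, rng=random):
    """Sketch of the time-slice algorithm for capacity 1.

    arrivals: iterable of (time, value) with times in [0, 1).
    cdf: prior CDF of arrival times (assumed known; uniform by default).
    use_left: force left (True) or right (False) halves; None flips a fair coin.
    Returns the accepted (time, value) pairs in order of arrival.
    """
    if use_left is None:
        use_left = rng.random() < 0.5
    accepted = []
    current = None            # index j of the slice being processed
    threshold = -math.inf     # best value seen in the observation phase
    done = False              # whether an item was already taken in this slice
    for t, v in sorted(arrivals):
        j = int(t // (2 * gamma))
        start = 2 * gamma * j + (0.0 if use_left else gamma)
        end = start + gamma
        if not (start <= t < end):
            continue          # item falls in the ignored half of its slice
        if j != current:
            current, threshold, done = j, -math.inf, False
        mass = cdf(end) - cdf(start)
        frac = (cdf(t) - cdf(start)) / mass if mass > 0 else 1.0
        if frac < 1 / math.e:
            threshold = max(threshold, v)      # observation phase
        elif not done and v > threshold:
            accepted.append((t, v))            # first item beating the threshold
            done = True
    return accepted
```

Because at most one item is taken from each selected half-slice, and selected half-slices are separated by an ignored half-slice of length $\gamma$, any two accepted items are at least $\gamma$ apart.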

Theorem 1. The algorithm $\mathrm{Slice}_\gamma$ is $1/(2e)$ competitive.

Proof. The analysis is as follows. Fix the mapping of items to each of the left
intervals $I_j^\ell$ and to each of the right intervals $I_j^r$ (leaving free the assignment
of items to specific arrival times within the intervals they are assigned to).
Let $\mathrm{OPT}^\ell$ ($\mathrm{OPT}^r$) be the sum, over all intervals $I_j^\ell$ ($I_j^r$), of the maximum item
value in the interval. Let $\mathrm{OPT}$ be the average optimal value conditioned on this mapping of
items to intervals. Since a $\gamma$-independent set contains at most one item from each
half-slice of length $\gamma$, clearly,

$$\mathrm{OPT}^\ell + \mathrm{OPT}^r \ge \mathrm{OPT}. \qquad (2)$$

For any interval $I_j^\ell$ ($I_j^r$), $\mathrm{Slice}_\gamma$ gains at least $1/e$ of the top value in the
interval conditioned on the event that $\mathrm{Slice}_\gamma$ doesn’t ignore this interval; this
happens with probability 1/2. Therefore the expected sum of values achieved by
$\mathrm{Slice}_\gamma$ is at least

$$\frac{1}{2} \cdot \frac{1}{e}\,\mathrm{OPT}^\ell + \frac{1}{2} \cdot \frac{1}{e}\,\mathrm{OPT}^r. \qquad (3)$$

Substituting (2) into (3) we get the theorem. □

Appropriately choosing times (rather than number of elements) as a function
of the prior distribution allows us to do the same for other variants of the
secretary problem: the knapsack variant (achieving a competitive ratio of $1/(20e)$, see
[5]) and the Matroid variant ($O(\ln\ln\rho)$ when $\rho$ is the rank of the Matroid, see [26,14]).

4 Improved results for the temp secretary problem for 
the uniform arrival distribution 

In this section we give an improved algorithm, referred to as the charter algorithm
$C_{k,\gamma}$, for the temp secretary problem with uniform arrival times and capacity 1
(at most one secretary can be hired at any time).

As it is never the case that more than $\lceil 1/\gamma \rceil$ items can be selected, setting $k =
\lceil 1/\gamma \rceil$ effectively removes the budget constraint. Note that $C_{k,0}$ is Kleinberg’s
algorithm for the $k$-secretary problem, with some missing details added to the
description.

To analyze the charter algorithm we establish a lower bound on the expected 
size of the maximum $\gamma$-independent subset of a set of uniformly random points
in [0,1). We apply this lower bound to the subset of the items that Kleinberg’s 
algorithm selects. 

4.1 The temp secretary algorithm, $C_{k,\gamma}$: a competitive ratio of $1/(1 + k\gamma)$.

The charter algorithm, $C_{k,\gamma}$, gets parameters $k$ (the maximal number of rentals
allowed) and $\gamma$ (the rental period) and is described in detail in Algorithm 1. As
the entire period is normalized to $[0,1)$, having $k > \lceil 1/\gamma \rceil$ is irrelevant. Thus,
we assume that $k \le \lceil 1/\gamma \rceil$.^2

We show that $C_{k,\gamma}(X)$ gains in expectation about $1/(1 + k\gamma)$ of the top $k$
values of $X$, which implies that the competitive ratio (see definition (1)) of $C_{k,\gamma}$
is at least about $1/(1 + k\gamma)$.

Note that for $k = \lceil 1/\gamma \rceil$, $C_{k,\gamma}$ has a competitive ratio close to 1/2, while for
$\gamma = 0$, $C_{k,\gamma}$ has a competitive ratio close to 1.

It is easy to see that $C_{k,\gamma}$ chooses a $\gamma$-independent set of size at most $k$.
The main theorem of this paper is the following generalization of Kleinberg’s
$k$-secretary problem:

Theorem 2. For any set of items $S = \{x_1, \ldots, x_n\}$, $0 < \gamma \le \gamma^* = 0.003176$ and
any positive integer $k \le 1/\gamma$:

$$E_{\theta : S \to [0,1)}[v(C_{k,\gamma}(S, \theta))] \ge \frac{1}{1+\gamma k}\,(1 - \beta(\gamma, k))\,T_k(S), \qquad (4)$$

where $\beta(\gamma, k) = 7.4\sqrt{\gamma \ln(1/\gamma)} + \frac{5}{\sqrt{k}}$ and the expectation is taken over all uniform
mappings of $S$ to the interval $[0,1)$. (Note that the right hand side of
Equation (4) is negative for $\gamma^* < \gamma < 0.5$.)
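
To get a feel for the guarantee of Theorem 2, the following sketch evaluates its right hand side numerically. It assumes $\beta(\gamma, k) = 7.4\sqrt{\gamma \ln(1/\gamma)} + 5/\sqrt{k}$, as in the $d = 1$ bound stated in the introduction; the names `beta`, `charter_guarantee` and `GAMMA_STAR` are hypothetical and only serve the illustration.

```python
import math

GAMMA_STAR = 0.003176  # threshold on gamma quoted in Theorem 2


def beta(gamma, k):
    """Loss term beta(gamma, k), under the assumed closed form."""
    return 7.4 * math.sqrt(gamma * math.log(1 / gamma)) + 5 / math.sqrt(k)


def charter_guarantee(gamma, k):
    """Guaranteed fraction of T_k(S): (1 - beta(gamma, k)) / (1 + gamma * k)."""
    return (1 - beta(gamma, k)) / (1 + gamma * k)


if __name__ == "__main__":
    for gamma, k in [(1e-6, 100), (1e-6, 10**4), (1e-4, 10**4)]:
        print(gamma, k, round(charter_guarantee(gamma, k), 4))
```

Two features of the bound become visible this way: the $\gamma$-dependent part of $\beta$ alone reaches roughly 1 at $\gamma = \gamma^*$, and the term $5/\sqrt{k}$ alone exceeds 1 whenever $k < 25$, so the bound is only informative for small $\gamma$ and large $k$.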

4.2 Outline of the proof of Theorem 2

We prove Theorem 2 by induction on $k$. For $k < 25$ the theorem holds vacuously.

The profit, $p^{[0,1/2)}$, on those items that arrive during the time interval $[0,1/2)$
is given by the inductive hypothesis.^3 However, the inductive hypothesis gives
this profit, $p^{[0,1/2)}$, in terms of the top $\lfloor k/2 \rfloor$ elements that arrive before time
1/2, and not in terms of $T_k(X)$, the value of the top $k$ items overall. Thus, we
need to relate $p^{[0,1/2)}$ to $T_k(X)$. In Lemma 2 we show that $p^{[0,1/2)}$ is about 1/2
of $T_k(X)$.

Let $Z_{>T}$ be the set of items that arrive in the time interval $[1/2,1)$ and have
value greater than the threshold $T$. From $Z_{>T}$ we greedily pick a $\gamma$-independent
subset.^4 It is easy to see that this set is in fact a maximal $\gamma$-independent subset.

To bound the expected profit from the items in $Z_{>T}$ we first bound the size
of the maximal $\gamma$-independent set amongst these items. To do so we use the
following general theorem (see also Section [51 and Section [C]) . 

^2 To simplify the presentation we shall assume in the sequel that $k \le 1/\gamma$.

^3 This profit, $p^{[0,1/2)}$, is $E_{\theta : S \to [0,1)}[v(C^{[0,1/2)}_{k,\gamma}(S, \theta))]$, where $C^{[0,1/2)}_{k,\gamma}(S, \theta)$ is the set of
items chosen by the algorithm during the time period $[0,1/2)$.

^4 Modulo the caveat that the arrival time of the 1st item chosen from the 2nd half
must be at least $\gamma$ after the arrival time of the last item chosen in the 1st half.

ALGORITHM 1: The Charter Algorithm $C_{k,\gamma}$.
if $k = 1$ then
    /* Use the “continuous secretary” algorithm: */
    Let $x$ be the largest item to arrive by time $1/e$ (if no item arrives by time $1/e$,
    let $x$ be the absolute zero, an item smaller than all other items).
    $C_{k,\gamma}$ accepts the first item $y$, $y > x$, that arrives after time $1/e$ (if any)
else
    /* Process the items scheduled during the time interval [0, 1/2) */
    Initiate a recursive copy of the algorithm, $C' = C_{\lfloor k/2 \rfloor, 2\gamma}$
    $x \leftarrow$ next element   // If no further items arrive, $x \leftarrow \emptyset$
    while $x \ne \emptyset$ AND $\theta(x) < 1/2$ do
        Simulate $C'$ with input $x$ and modified schedule $\theta'(x) = 2\theta(x)$.
        if $C'$ accepts $x$ then
            $C_{k,\gamma}$ accepts $x$
        $x \leftarrow$ next element   // If no further items arrive, $x \leftarrow \emptyset$
    /* Determine threshold T */
    Sort the items that arrived during the time interval [0, 1/2): $y_1 > y_2 > \cdots > y_m$
    (with consistent tie breaking).
    Let $\tau = \lceil k/2 \rceil$.
    if $m < \tau$ then
        set $T$ to be the absolute zero
    else
        set $T \leftarrow y_\tau$.
    /* Process the items scheduled during the time interval [1/2, 1) */
    do
        if $x > T$ AND ($\theta(x) \ge \theta(x') + \gamma$ where $x'$ is the last item accepted by $C_{k,\gamma}$
        OR no items have been previously accepted) then
            $C_{k,\gamma}$ accepts $x$
        $x \leftarrow$ next element   // If no further items arrive, $x \leftarrow \emptyset$
    until $x = \emptyset$ OR $k$ items have already been accepted
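
The pseudocode above can be turned into a small executable sketch. The version below is illustrative only and makes several assumptions: the name `charter` is hypothetical, arrivals are given up front as `(time, value)` pairs with distinct values (the function still only uses information available online at each decision), the threshold rank is taken to be $\lceil k/2 \rceil$, and the absolute zero is modelled by `-math.inf`.

```python
import math


def charter(arrivals, k, gamma):
    """Sketch of the Charter algorithm for capacity 1 and uniform arrivals.

    arrivals: list of (time, value), times in [0, 1), values distinct.
    Returns the accepted (time, value) pairs in order of arrival.
    """
    arrivals = sorted(arrivals)
    if k == 1:
        # Continuous secretary rule: observe until time 1/e, then take the
        # first item that beats everything observed so far.
        early = [v for t, v in arrivals if t < 1 / math.e]
        x = max(early, default=-math.inf)
        for t, v in arrivals:
            if t >= 1 / math.e and v > x:
                return [(t, v)]
        return []
    first = [(t, v) for t, v in arrivals if t < 0.5]
    second = [(t, v) for t, v in arrivals if t >= 0.5]
    # Recursive copy on the first half, with times stretched to [0, 1)
    # and the rental period doubled accordingly.
    sub = charter([(2 * t, v) for t, v in first], k // 2, 2 * gamma)
    accepted = [(t / 2, v) for t, v in sub]
    # Threshold: the tau-th largest value seen in the first half, if any.
    tau = math.ceil(k / 2)
    ranked = sorted((v for _, v in first), reverse=True)
    threshold = ranked[tau - 1] if len(ranked) >= tau else -math.inf
    # Greedy phase on the second half, respecting budget and rental period.
    for t, v in second:
        if len(accepted) >= k:
            break
        if v > threshold and (not accepted or t >= accepted[-1][0] + gamma):
            accepted.append((t, v))
    return accepted
```

For example, on the arrivals `[(0.1, 4), (0.2, 6), (0.45, 5), (0.55, 3), (0.6, 7), (0.65, 9), (0.9, 8)]` with `k = 2` and `gamma = 0.2`, the recursive copy accepts the item of value 6, the threshold becomes 6, and the greedy phase accepts the item of value 7 before the budget is exhausted.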


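Theorem 3. Let $Z = \{z_1, z_2, \ldots, z_n\}$ be a set of independently uniform samples,
$z_i$, from the real interval $[0,1)$. For $0 < \gamma < 1$,

$$E_Z[m(Z, \gamma)] \ge \frac{1 - \alpha(\gamma)}{\gamma + 1/n} = \frac{n}{1 + n\gamma}\,(1 - \alpha(\gamma)), \qquad \alpha(\gamma) = 3\sqrt{\gamma \ln(1/\gamma)}, \qquad (5)$$

where $m(Z, \gamma)$ denotes the size of the largest $\gamma$-independent subset of $Z$.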

We apply Theorem 3 to the items in $Z_{>T}$. We can apply this theorem since
arrival times of items in $Z_{>T}$ are uniformly distributed in the 2nd half. Specifically,
we give a lower bound on the expected profit of the algorithm from the
items in the 2nd half as follows:


1. Condition on the size of $Z_{>T}$.

2. Subsequently, condition on the set of arrival times $\{\theta_1, \theta_2, \ldots, \theta_{|Z_{>T}|}\}$ of
items in $Z_{>T}$ but not on which item in $Z_{>T}$ arrives when. This conditioning
fixes the $\gamma$-independent set selected greedily by the algorithm.

3. We take the expectation over all bijections $\theta$ whose image on the domain
$Z_{>T}$ is the set $\{\theta_1, \theta_2, \ldots, \theta_{|Z_{>T}|}\}$. The expected profit (over the set $Z_{>T}$
and over these bijections) is “approximately”

$$\frac{\text{Size of maximal } \gamma\text{-independent set from } \{\theta_1, \theta_2, \ldots, \theta_{|Z_{>T}|}\}}{|Z_{>T}|} \cdot \sum_{z \in Z_{>T}} v(z). \qquad (6)$$

The “approximately” is because of some technical difficulties:

— We cannot ignore the last item amongst those arriving prior to time 1/2.
If one such item was chosen at some time $1/2 - \gamma < t < 1/2$ then arrivals
during the period $[1/2, t + \gamma)$ cannot be chosen.

— We cannot choose more than $k$ items in total. If the algorithm chooses $\lambda$
items from the time interval $[0,1/2)$, it cannot choose more than $k - \lambda$
items from the time interval $[1/2,1)$, but $k - \lambda$ may be smaller than the
size of the $\gamma$-independent set from $Z_{>T}$.

4. To get an unconditional lower bound we average Equation (6) over the possible
sizes of the $\gamma$-independent set as given by Theorem 3.

5 Upper bound for the temp secretary problem with 
uniform arrival times and with no budget restriction 

Theorem 4. For the temp secretary problem where item arrival times are taken
from the uniform distribution, for any $\gamma \in (0,1)$, any online algorithm (potentially
randomized) has a competitive ratio $\le 1/2 + \gamma/2$.

Proof. Let $A$ denote the algorithm. Consider the following two inputs:

1. The set $S$ of $n-1$ items of value 1.

2. The set $S' = S \cup \{x_n\}$ where $v(x_n) = \infty$.

Note that these inputs are not of the same size (which is ok as the number of
items is unknown to the algorithm).

Condition on the mapping $\theta : S \to [0,1)$ (but not the mapping of $x_n$). If $A$
accepts an item $x$ at time $\theta(x)$ we say that the segment $[\theta(x), \theta(x) + \gamma)$ is covered. For
a fixed $\theta$ let $g(\theta)$ be the expected fraction of $[0,1)$ which is not covered when
running $A$ on the set $S$ with arrival times $\theta$. This expectation is over the coin
tosses of $A$. Let $G$ be $E_{\theta : S \to [0,1)}[g(\theta)]$.

The expected number of items that $A$ picks on the input $S$ with arrival times $\theta$ is at
most $(1 - g(\theta))/\gamma + 1$. Taking expectation over all mappings $\theta : S \to [0,1)$ we get that
the value gained by $A$ is at most $(1 - G)/\gamma + 1$.

As $n \to \infty$ the optimal solution consists of $\lceil 1/\gamma \rceil$ items of total value $\lceil 1/\gamma \rceil$.
Therefore the competitive ratio of $A$ is at most

$$\frac{(1 - G)/\gamma + 1}{\lceil 1/\gamma \rceil} \le 1 - G + \gamma. \qquad (7)$$

Note that $g(\theta)$ is exactly the probability that $A$ picks $x_n$ on the input $S' = S \cup \{x_n\}$
(this probability is over the mapping of $x_n$ to $[0,1)$ conditioned upon the arrival
times of all the items in $S \subset S'$). Therefore the competitive ratio of $A$ on the
input $S'$ is

$$E_{\theta : S \to [0,1)}[g(\theta)] = G. \qquad (8)$$

Therefore the competitive ratio of $A$ is no more than the minimum of the two
upper bounds (7) and (8):

$$\min(G, 1 - G + \gamma) \le 1/2 + \gamma/2.$$

□ 
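
The last step of the proof uses only the fact that a minimum of two numbers is at most their average. Written out for the two upper bounds $G$ and $1 - G + \gamma$ on the competitive ratio:

```latex
\min(G,\; 1 - G + \gamma)
  \;\le\; \frac{G + (1 - G + \gamma)}{2}
  \;=\; \frac{1}{2} + \frac{\gamma}{2},
\qquad \text{with equality exactly when } G = \frac{1 + \gamma}{2}.
```

In other words, an algorithm that leaves much of $[0,1)$ uncovered loses on the all-ones input, while one that covers most of it is likely to miss the single very valuable item.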


6 About Theorem 3: A lower bound on the expected
size of the maximum $\gamma$-independent subset


Recall the definition of $Z$ and $m(Z, \gamma)$ from Theorem 3.

Define the random variable $X_i$, $1 \le i \le n$, to be the $i$’th smallest point in
$Z$. Define the random variable $C_i$ to be the number of points from $Z$ that lie in
the interval $[X_i, X_i + \gamma)$. Note that at most one of these points can belong to a
$\gamma$-independent set.

The greedy algorithm constructs a maximal $\gamma$-independent set by traversing
the points of $Z$ from small to large and adding a point whenever possible. Let
$I_i$ be a random variable with binary values where $I_i = 1$ iff $X_i$ was chosen by
the greedy algorithm. It follows from the definition that $\sum_i I_i$ gives the size of
the maximal independent set, $m(Z, \gamma)$, and that $\sum_i I_i C_i = n$.

Note that $E[C_i] \le 1 + n\gamma$: one for the point $X_i$ itself, and $n\gamma$ as the expected
number of uniformly random points that fall into an interval of length $\gamma$. If $C_i$
and $I_i$ were independent random variables, it would follow that

$$n = E\left[\sum_i I_i C_i\right] \le (1 + n\gamma) \sum_i \mathrm{Prob}[I_i = 1]$$

and, thus,

$$E[m(Z, \gamma)] = E\left[\sum_i I_i\right] \ge n/(1 + n\gamma).$$

Unfortunately, $C_i$ and $I_i$ are not independent, and the rest of the proof of
Theorem 3 in Section C primarily deals with showing that this dependency is
insignificant.
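
The heuristic estimate $n/(1 + n\gamma)$ can be checked empirically. The following Monte Carlo sketch is an illustration only: the function names and the fixed seed are our own choices, and the simulation merely estimates the expected size of the greedy $\gamma$-independent set for uniform points, it proves nothing.

```python
import random


def greedy_independent(points, gamma):
    """Greedy gamma-independent subset: scan left to right, keep a point
    whenever it is at least gamma after the last kept point."""
    chosen = []
    for x in sorted(points):
        if not chosen or x - chosen[-1] >= gamma:
            chosen.append(x)
    return chosen


def average_greedy_size(n, gamma, trials=2000, seed=0):
    """Average greedy set size over independent draws of n uniform points."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        points = [rng.random() for _ in range(n)]
        total += len(greedy_independent(points, gamma))
    return total / trials


if __name__ == "__main__":
    n, gamma = 100, 0.01
    print("simulated:", average_greedy_size(n, gamma))
    print("n / (1 + n * gamma):", n / (1 + n * gamma))
```

With $n = 100$ and $\gamma = 1/n$ the simulated average should land near $n/2$, matching the intuition that each chosen interval rules out about one other interval on average.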


7 Discussion and Open Problems 

We’ve introduced online optimization over temporal items under stochastic inputs
subject to conditions of two different types:

— “Vertical” constraints: Predicates on the set of items held at all times $t$. In
this class, we’ve considered conditions such as no more than $d$ simultaneous
items held at any time, items held at any time of total weight $\le W$, items
held at any time must be independent in some Matroid.

— “Horizontal” constraints: Predicates on the set of items over all times. Here,
we’ve considered the condition that no more than $k$ employees be hired over
time.

One could imagine much more complex settings where the problem is defined 
by arbitrary constraints of the first type above, and arbitrary constraints of the 
2nd type. For example, consider using knapsack constraints in both dimensions. 
The knapsack constraint for any time t can be viewed as the daily budget for 
salaries. The knapsack constraint over all times can be viewed as the total budget 
for salaries. Many other natural constraints suggest themselves. 

It seems plausible that the time slice algorithm can be improved, at least in 
some cases, by making use of information revealed over time, as done by the 
Charter algorithm. 

APPENDIX


A Table of results 


Table 1. Size of maximal $\gamma$-independent set. Uniform prior on arrivals.

| Quantity | Bound | Source |
| Size of maximal $\gamma$-independent set, capacity $d = 1$ | $\frac{n}{1+n\gamma}(1 - \Theta(\sqrt{\gamma\ln(1/\gamma)}))$ | Theorem 3 |
| Size of maximal $\gamma$-independent set of capacity $d$, $n = d \cdot 1/\gamma$ | | Lower bound: Theorem 5; Lower bound: Theorem 6 |
| Size of maximal $\gamma$-independent set of capacity $d$ | $\ge \min(n, d/\gamma)(1 - \Theta(\sqrt{\ln d}/\sqrt{d}))$; $\le \min(n, d/\gamma)$ | Theorem 5 |


Table 2. Competitive ratios, arbitrary prior on arrivals.

| Setting | Competitive ratio | Source |
| Capacity one $\gamma$-independent subset | $1/(2e)$ | Theorem 1 |
| Matroid constraints | $\Omega(1/\ln\ln\rho)$ | [26] & Theorem 1 |
| Knapsack constraints | $1/(20e)$ | [2] & Theorem 1 |
| Capacity $d$ $\gamma$-independent set | $(1/2)(1 - 5/\sqrt{d})$ | [24] & Theorem 1 |


















B Proving the main Theorem 


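Theorem 2. For any set of items $S = \{x_1, \ldots, x_n\}$, $0 < \gamma \le \gamma^* = 0.003176$ and
any positive integer $k \le 1/\gamma$:

$$E_{\theta : S \to [0,1)}[v(C_{k,\gamma}(S, \theta))] \ge \frac{1}{1+\gamma k}\,(1 - \beta(\gamma, k))\,T_k(S), \qquad (4)$$

where $\beta(\gamma, k) = 7.4\sqrt{\gamma \ln(1/\gamma)} + \frac{5}{\sqrt{k}}$ and the expectation is taken over all uniform
mappings of $S$ to the interval $[0,1)$. (Note that the right hand side of
Equation (4) is negative for $\gamma^* < \gamma < 0.5$.)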

Proof. The proof is via induction on $k$. For the base case of the induction we
note that for any $k < 25$ the theorem holds since $1 - 5/\sqrt{k} < 0$. We hereby
assume that the theorem holds for any $k' < k$, and prove the statement for $k$.

The proof is presented top-down. We refer to Lemmata 1, 2 and 5 whose
statement and proof appear subsequently.

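Let $C^{[0,1/2)}_{k,\gamma}(X, \theta)$ and $C^{[1/2,1)}_{k,\gamma}(X, \theta)$ be the subsets of $X$ chosen by algorithm
$C_{k,\gamma}$ during the time intervals $[0,1/2)$ and $[1/2,1)$ respectively when applied to
$X, \theta$.

For the induction step we use the fact that

$$E_{\theta : S \to [0,1)}[v(C_{k,\gamma}(S, \theta))] = E_{\theta : S \to [0,1)}\left[v\left(C^{[0,1/2)}_{k,\gamma}(S, \theta)\right)\right] + E_{\theta : S \to [0,1)}\left[v\left(C^{[1/2,1)}_{k,\gamma}(S, \theta)\right)\right]. \qquad (9)$$

We give a lower bound on the first term in (9) using the induction hypothesis
(after an appropriate transformation) and we directly lower bound the second
term using Lemma 5.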

Since Lemma 5 requires that the size $n$ of $S$ is sufficiently large relative to $k$
(which may not be the case) we introduce a modification of $C_{k,\gamma}$, Algorithm 2,
denoted by $C'_{k,\gamma}$.

By Lemma 1 we have that

$$E_{\theta : S \to [0,1)}[v(C_{k,\gamma}(S, \theta))] \ge E_{\theta : S \to [0,1)}[v(C'_{k,\gamma}(S, \theta))]. \qquad (10)$$


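By definition of $C'_{k,\gamma}$ above we have that

$$E_{\theta : S \to [0,1)}[v(C'_{k,\gamma}(S, \theta))] = E_{\theta : S \to [0,1)}\left[v\left(C^{[0,1/2)}_{k,\gamma}(S, \theta)\right)\right] + E_{\theta' : (S \cup D) \to [0,1)}\left[v\left(C^{[1/2,1)}_{k,\gamma}(S \cup D, \theta')\right)\right]. \qquad (11)$$

The mapping $\theta'$ in Equation (11), with domain $S \cup D$, combines $\theta : S \to [0,1)$
and $\theta_D : D \to [0,1)$; this is well defined because $S$ and $D$ are disjoint.

We give a lower bound for the first term in Equation (11) using the inductive
hypothesis. Using Lemma 5 we derive a lower bound for the 2nd term in Equation
(11).
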
ALGORITHM 2: $C'_{k,\gamma}$

During the time interval $[0,1/2)$, $C'_{k,\gamma}$ emulates $C_{k,\gamma}$, i.e., $C'^{[0,1/2)}_{k,\gamma}(S) = C^{[0,1/2)}_{k,\gamma}(S)$.
$D \leftarrow$ a collection of $3\lceil k/2 \rceil$ distinguishable dummy items with zero value.
$\theta_D \leftarrow$ random uniform mapping $D \to [0,1)$
$U \leftarrow \{x \in S \mid \theta(x) < 1/2\} \cup \{d \in D \mid \theta_D(d) < 1/2\}$.
/* set the threshold $T^*$ */
$\tau = \lceil k/2 \rceil$
if $|U| \ge \tau$ then
    $T^* \leftarrow$ the $\tau$-th largest value amongst $U$
    /* Note that if the number of items from $S$ mapped to the interval
       $[0,1/2)$ is $\ge \tau$, then $C'_{k,\gamma}$ sets the same threshold as $C_{k,\gamma}$ */
else
    $T^* \leftarrow$ absolute zero   // smaller than any other item
/* the arrival time of the next item will be after time 1/2 */
$x \leftarrow$ next item from $S \cup D$   // If no further items arrive, $x \leftarrow \emptyset$
while $x \ne \emptyset$ AND no more than $k - 1$ items have already been accepted do
    if $x > T^*$ AND ($\theta'(x) \ge \theta'(x') + \gamma$ where $x'$ is the last item accepted by $C'_{k,\gamma}$
    OR no items have been previously accepted) then
        $C'_{k,\gamma}$ accepts $x$
    $x \leftarrow$ next item from $S \cup D$   // If no further items arrive, $x \leftarrow \emptyset$


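To simplify the notation hereinafter we abbreviate $E_{\theta : S \to [0,1)}[v(C_{k,\gamma}(S, \theta))]$
as $E[C_{k,\gamma}(S)]$.

We first give a lower bound on the first term in Equation (11). Given a set
of items $S$, fix the set of items $S^{[0,1/2)} \subseteq S$ arriving in $[0,1/2)$. The arrival times
of $S^{[0,1/2)}$ are uniform in $[0,1/2)$. Therefore, conditioned on $S^{[0,1/2)}$ arriving in
$[0,1/2)$, the expected profit of $C_{k,\gamma}$ from $S^{[0,1/2)}$ equals

$$E\left[C_{\lfloor k/2 \rfloor, 2\gamma}\left(S^{[0,1/2)}\right)\right].$$

By induction we obtain

$$E\left[C_{\lfloor k/2 \rfloor, 2\gamma}\left(S^{[0,1/2)}\right)\right] \ge \frac{1}{1 + 2\gamma\lfloor k/2 \rfloor}\,(1 - \beta(2\gamma, \lfloor k/2 \rfloor))\,T_{\lfloor k/2 \rfloor}\left(S^{[0,1/2)}\right).$$
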
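It therefore follows that the expected profit of algorithm $C_{k,\gamma}$ from elements
arriving in $[0,1/2)$ (without any conditioning on $S^{[0,1/2)}$) is

$$
\begin{aligned}
E\left[C^{[0,1/2)}_{k,\gamma}(S)\right]
&= E_{S^{[0,1/2)} \subseteq S}\left[E\left[C_{\lfloor k/2 \rfloor, 2\gamma}\left(S^{[0,1/2)}\right)\right]\right] \\
&\ge E_{S^{[0,1/2)} \subseteq S}\left[\frac{1}{1 + 2\gamma\lfloor k/2 \rfloor}\,(1 - \beta(2\gamma, \lfloor k/2 \rfloor))\,T_{\lfloor k/2 \rfloor}\left(S^{[0,1/2)}\right)\right] \\
&= \frac{1}{1 + 2\gamma\lfloor k/2 \rfloor}\,(1 - \beta(2\gamma, \lfloor k/2 \rfloor))\,E_{S^{[0,1/2)} \subseteq S}\left[T_{\lfloor k/2 \rfloor}\left(S^{[0,1/2)}\right)\right] \\
&\ge \frac{1}{1 + \gamma k}\,(1 - \beta(2\gamma, \lfloor k/2 \rfloor))\,E_{S^{[0,1/2)} \subseteq S}\left[T_{\lfloor k/2 \rfloor}\left(S^{[0,1/2)}\right)\right]. \qquad (12)
\end{aligned}
$$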

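The set $S^{[0,1/2)}$ is a uniformly random subset of $S$, therefore applying Lemma 2
we get

$$E_{S^{[0,1/2)} \subseteq S}\left[T_{\lfloor k/2 \rfloor}\left(S^{[0,1/2)}\right)\right] \ge \left(1 - \frac{1}{2\sqrt{k}}\right)\frac{T_k(S)}{2}. \qquad (13)$$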


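By substituting the lower bound (13) into Equation (12) we get that

$$
\begin{aligned}
E\left[C^{[0,1/2)}_{k,\gamma}(S)\right]
&\ge \frac{1}{1 + \gamma k}\,(1 - \beta(2\gamma, \lfloor k/2 \rfloor))\left(1 - \frac{1}{2\sqrt{k}}\right)\frac{T_k(S)}{2} \\
&= \frac{1}{1 + \gamma k}\left(1 - 7.4\sqrt{2\gamma\ln(1/(2\gamma))} - \frac{5}{\sqrt{\lfloor k/2 \rfloor}}\right)\left(\frac{1}{2} - \frac{1}{4\sqrt{k}}\right)T_k(S) \\
&\ge \frac{1}{1 + \gamma k}\left(\frac{1}{2} - \frac{\sqrt{2}}{2}\,7.4\sqrt{\gamma\ln(1/\gamma)} - \left(\frac{\sqrt{2}}{2}\cdot\frac{5}{\sqrt{k-1}} + \frac{1}{4\sqrt{k}}\right)\right)T_k(S), \qquad (14)
\end{aligned}
$$

where the last inequality in this derivation follows since $\lfloor k/2 \rfloor \ge (k-1)/2$.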
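Now we give a lower bound on the second term of Equation (11). By Lemma 5 (recall that $|S\cup D| \ge 3\lceil k/2\rceil$) we obtain that the expected profit of $C_{k,\gamma}$ executed on input $S\cup D$ during the time interval $[1/2,1)$ is at least

$$\begin{aligned}
E\left[C^{[1/2,1)}_{k,\gamma}(S\cup D)\right]
&\ge \frac{1}{1+\gamma k}\bigl(1-\sqrt2\cdot\alpha(\gamma)-4.5\gamma\bigr)\left(1-\frac{2\sqrt{1+1/k}}{\sqrt{k}}-\frac1k\right)\frac{T_k(S\cup D)}{2} \\
&\ge \frac{1}{1+\gamma k}\left(\frac12-\left(\frac{\sqrt2}{2}\alpha(\gamma)+\frac94\gamma\right)-\left(\frac{\sqrt{1+1/k}}{\sqrt{k}}+\frac{1}{2k}\right)\right)T_k(S\cup D) \\
&= \frac{1}{1+\gamma k}\left(\frac12-\left(\frac{\sqrt2}{2}\alpha(\gamma)+\frac94\gamma\right)-\left(\frac{\sqrt{1+1/k}}{\sqrt{k}}+\frac{1}{2k}\right)\right)T_k(S), \qquad (15)
\end{aligned}$$

where $\alpha(\gamma) = 3\sqrt{\gamma\ln(1/\gamma)}$ as defined in Theorem 3. The last equality follows since $v(D) = 0$, therefore $T_k(S\cup D) = T_k(S)$.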


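Substituting the lower bounds (14) and (15) into Equation (11) and separating terms that depend on $\gamma$ from those that depend on $k$ we get the lower bound

$$\frac{1}{1+\gamma k}\cdot\left(1-\left(\frac{\sqrt2}{2}\,7.4\sqrt{\gamma\ln(1/\gamma)}+\frac{\sqrt2}{2}\alpha(\gamma)+\frac94\gamma\right)-\left(\frac{\sqrt2}{2}\cdot\frac{5}{\sqrt{k-1}}+\frac{1}{4\sqrt{k}}+\frac{\sqrt{1+1/k}}{\sqrt{k}}+\frac{1}{2k}\right)\right)T_k(S).$$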

We observe that for all 7 < 7 * = 0.003176 we have 


^7.4\/7ln(l/7) -f ^ 0 ( 7 ) -h ^7 


\/7ln(l/7) 


< 


^ + i\/k(I77)) 7Tln(l/7) 

7.3407561\/7ln(l/7) < • 


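Similarly for $k \ge 25$ we have

$$\begin{aligned}
\frac{\sqrt2}{2}\cdot\frac{5}{\sqrt{k-1}}+\frac{1}{4\sqrt{k}}+\frac{\sqrt{1+1/k}}{\sqrt{k}}+\frac{1}{2k}
&= \left(\frac{5\sqrt2}{2\sqrt{1-1/k}}+\frac14+\sqrt{1+1/k}+\frac{1}{2\sqrt{k}}\right)\frac{1}{\sqrt{k}} \\
&\le \left(\frac{5\sqrt2}{2\sqrt{1-1/25}}+\frac14+\sqrt{1+1/25}+\frac{1}{2\sqrt{25}}\right)\frac{1}{\sqrt{k}} \\
&\le \frac{4.9783}{\sqrt{k}} \le \frac{5}{\sqrt{k}}. \qquad (16)
\end{aligned}$$

Substituting (16) into (10) we obtain the statement of the theorem.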


Lemma 1. For any set S 


Proof. Since $C'_{k,\gamma}$ runs exactly as $C_{k,\gamma}$ in the time interval $[0,1/2)$, the expected value of both of them in this interval is the same. Hence all we need to prove is


EeiSi-s-fo.i) 



> Egi.s^[04) 


V 





(17) 


The proof splits the probability space $\theta : S \to [0,1)$ into subspaces. We prove that Inequality (17) holds for each subspace and therefore it holds for the entire probability space.

We break the probability space by conditioning on the following:

- The subset of items of $S$ arriving in $[0,1/2)$. We denote this subset by $S^{[0,1/2)}$. Note that once we condition on $S^{[0,1/2)}$ the following are also fixed:

  • The subset $S^{[1/2,1)} \subseteq S$ of the items of $S$ arriving in $[1/2,1)$.

  • The threshold $T$ that $C_{k,\gamma}$ (and $C'_{k,\gamma}$) use to pick items from $S^{[1/2,1)}$.

  • The subset $S^{[1/2,1)}_{\ge T} \subseteq S^{[1/2,1)}$ of items in $S^{[1/2,1)}$ with value at least $T$.

- The time $t$ at which $C_{k,\gamma}$ last chose an element in $[0,1/2)$. Note that given this conditioning $C_{k,\gamma}$ (and $C'_{k,\gamma}$) can first pick an item in $[1/2,1)$ after time $\max\{1/2,\, t+\gamma\}$. We denote this time by $\delta$.

- The number of elements, $\lambda$, picked by $C_{k,\gamma}$ (and $C'_{k,\gamma}$) in $[0,1/2)$.

- The arrival times $x_1,\ldots,x_q \in [1/2,1)$, $q = |S^{[1/2,1)}_{\ge T}|$, of the items in $S^{[1/2,1)}_{\ge T}$. (Note that we do not fix which item of $S^{[1/2,1)}_{\ge T}$ arrives at $x_i$ for any $1 \le i \le q$.) We denote the set $\{x_i \mid x_i \ge \delta\}$ by $X_\delta$.


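This conditioning determines the $\gamma$-independent subset $\Gamma$ of $X_\delta$ in which $C_{k,\gamma}$ picks items of $S^{[1/2,1)}_{\ge T}$. Let $m_\gamma(X_\delta)$ denote the size of the maximum $\gamma$-independent subset of $X_\delta$. The size of $\Gamma$ is the minimum of $k-\lambda$ and $m_\gamma(X_\delta)$.

The expectation of $E_{\theta:S\to[0,1)}\bigl[v\bigl(C^{[1/2,1)}_{k,\gamma}(S,\theta)\bigr)\bigr]$ in a subspace defined by the conditioning above is the average, over all 1-1 mappings of $S^{[1/2,1)}_{\ge T}$ to the arrival times $x_1,\ldots,x_q$, of the values of the items mapped to $\Gamma$. In the following we denote this conditional expectation by $E\bigl[C^{[1/2,1)}_{k,\gamma}(S)\bigr]$. Hence we get that

$$E\left[C^{[1/2,1)}_{k,\gamma}(S)\right] = \frac{v\bigl(S^{[1/2,1)}_{\ge T}\bigr)}{q}\cdot\min\{k-\lambda,\ m_\gamma(X_\delta)\}. \qquad (18)$$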


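Our conditioning does not fix the $\gamma$-independent subset $\Gamma^*$ of $X_\delta$ in which $C'_{k,\gamma}$ picks items of $S^{[1/2,1)}_{\ge T}$, because $\Gamma^*$ depends on $\theta_D$. We denote the expectation of $E_{\theta:S\to[0,1)}\bigl[v\bigl(C'^{[1/2,1)}_{k,\gamma}(S,\theta)\bigr)\bigr]$ in a subspace defined by the conditioning above by $E\bigl[C'^{[1/2,1)}_{k,\gamma}(S)\bigr]$. As before we have

$$E\left[C'^{[1/2,1)}_{k,\gamma}(S)\right] = E\left[\frac{|\Gamma^*|\cdot v\bigl(S^{[1/2,1)}_{\ge T}\bigr)}{q}\right] = \frac{v\bigl(S^{[1/2,1)}_{\ge T}\bigr)}{q}\cdot E\bigl[|\Gamma^*|\bigr], \qquad (19)$$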


where the expectation of $|\Gamma^*|$ is taken over all 1-1 mappings of $S^{[1/2,1)}_{\ge T}$ to the arrival times $x_1,\ldots,x_q$ and over $\theta_D$.

We can give an upper bound on the value of $|\Gamma^*|$ for every 1-1 mapping of $S^{[1/2,1)}_{\ge T}$ to the arrival times $x_1,\ldots,x_q$ and every $\theta_D$. Note that $C'_{k,\gamma}$ can select at most $k$ items and therefore $|\Gamma^*| \le k-\lambda$. It also holds that $\Gamma^*$ is a $\gamma$-independent subset of $X_\delta$ and therefore $|\Gamma^*| \le m_\gamma(X_\delta)$, hence

$$|\Gamma^*| \le \min\{k-\lambda,\ m_\gamma(X_\delta)\}. \qquad (20)$$

Substituting (20) into (19) and comparing it to (18) we get that $E\bigl[C^{[1/2,1)}_{k,\gamma}(S)\bigr] \ge E\bigl[C'^{[1/2,1)}_{k,\gamma}(S)\bigr]$. Since this holds for any subspace defined by our conditioning on the values of $\delta$, $\lambda$ and $\{x_1,\ldots,x_q\}$, the lemma follows.

□


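Lemma 2. Let $S$ be a set of size $n$ and let $Y$ be a subset of $S$ chosen uniformly at random amongst all subsets of $S$. Then for any $25 < k \le n$:

$$E\bigl[T_{\lfloor k/2\rfloor}(Y)\bigr] \ge \left(1-\frac{1}{2\sqrt{k}}\right)\frac{T_k(S)}{2}.$$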


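Proof. From Lemma 3 we know that

$$E\bigl[T_{\lfloor k/2\rfloor}(Y)\bigr] \ge T_k(S)\sum_{r=1}^{k}\binom{k}{r}\frac{1}{2^k}\cdot\frac{\min(r,\lfloor k/2\rfloor)}{k}.$$

We consider even and odd $k$.
For $k$ even:

$$\begin{aligned}
T_k(S)\sum_{r=1}^{k}\binom{k}{r}\frac{1}{2^k}\cdot\frac{\min(r,k/2)}{k}
&= T_k(S)\left(\sum_{r=1}^{k/2}\binom{k}{r}\frac{r}{k}\frac{1}{2^k}+\frac12\sum_{r=k/2+1}^{k}\binom{k}{r}\frac{1}{2^k}\right) \\
&= \frac{T_k(S)}{2}\left(\sum_{r=1}^{k/2}\binom{k-1}{r-1}\frac{1}{2^{k-1}}+\sum_{r=k/2+1}^{k}\binom{k}{r}\frac{1}{2^k}\right) \\
&= \frac{T_k(S)}{2}\left(\mathrm{Prob}\bigl[\mathrm{Bin}(k-1,\tfrac12)\le \tfrac k2-1\bigr]+\mathrm{Prob}\bigl[\mathrm{Bin}(k,\tfrac12)\ge \tfrac k2+1\bigr]\right) \\
&= \frac{T_k(S)}{2}\left(\frac12+\frac12-\binom{k}{k/2}\frac{1}{2^{k+1}}\right) \qquad (21)\\
&= \frac{T_k(S)}{2}\left(1-\binom{k}{k/2}\frac{1}{2^{k+1}}\right). \qquad (22)
\end{aligned}$$

We derive (21) as follows:

- $\mathrm{Prob}\bigl[\mathrm{Bin}(k-1,\tfrac12)\le \tfrac k2-1\bigr] = 1/2$.

  Consider $k-1$ coin tosses of a fair coin; the number of heads can be $0,1,\ldots,k-1$. The probability that there are $i$ heads is equal to the probability that there are $k-1-i$ heads. As $k-1-(k/2-1) = k/2$, the set $\{0,\ldots,k/2-1\}$ is disjoint from the set $\{k/2,\ldots,k-1\}$ and their union is $\{0,\ldots,k-1\}$.

- $\mathrm{Prob}\bigl[\mathrm{Bin}(k,\tfrac12)\ge \tfrac k2+1\bigr] = \frac12-\binom{k}{k/2}\frac{1}{2^{k+1}}$.

  Consider $k$ tosses of a fair coin; the outcome can have $0,1,\ldots,k$ heads (an odd number of outcomes). As above, the probability that there are $i$ heads, $i = 0,1,\ldots,k/2-1$, is equal to the probability that there are $k-i$ heads. Ergo, $\mathrm{Prob}[\#\text{heads}\in\{0,\ldots,k/2-1\}] = \mathrm{Prob}[\#\text{heads}\in\{k/2+1,\ldots,k\}]$, and

  $$\mathrm{Prob}[\#\text{heads}\in\{0,\ldots,k/2-1\}]+\mathrm{Prob}[\#\text{heads}\in\{k/2+1,\ldots,k\}]+\mathrm{Prob}[\#\text{heads}=k/2] = 1.$$

  As the probability that there are exactly $k/2$ heads is $\binom{k}{k/2}\frac{1}{2^k}$, solving for $\mathrm{Prob}[\#\text{heads}\in\{k/2+1,\ldots,k\}]$ gives the desired result.

From Stirling's formula we get:

$$\binom{k}{k/2}\frac{1}{2^k} \le \frac{e}{\pi}\cdot\frac{\sqrt{k}}{k} \le \frac{1}{\sqrt{k}}. \qquad (23)$$

By substituting (23) in (22) we get

$$E\bigl[T_{k/2}(Y)\bigr] \ge \frac{T_k(S)}{2}\left(1-\frac{1}{2\sqrt{k}}\right),$$

finishing the proof for $k$ even.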
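For an odd $k$:

$$\begin{aligned}
T_k(S)\sum_{r=1}^{k}\binom{k}{r}\frac{1}{2^k}\cdot\frac{\min(r,\lfloor k/2\rfloor)}{k}
&= T_k(S)\left(\sum_{r=1}^{\lfloor k/2\rfloor}\binom{k}{r}\frac{r}{k}\frac{1}{2^k}+\sum_{r=\lfloor k/2\rfloor+1}^{k}\binom{k}{r}\frac{1}{2^k}\cdot\frac{\lfloor k/2\rfloor}{k}\right) \\
&= T_k(S)\left(\frac12\sum_{r=1}^{\lfloor k/2\rfloor}\binom{k-1}{r-1}\frac{1}{2^{k-1}}+\frac{k-1}{2k}\sum_{r=\lfloor k/2\rfloor+1}^{k}\binom{k}{r}\frac{1}{2^k}\right) \\
&= \frac{T_k(S)}{2}\left(\mathrm{Prob}\Bigl[\mathrm{Bin}\bigl(k-1,\tfrac12\bigr)\le \Bigl\lfloor\tfrac k2\Bigr\rfloor-1\Bigr]+\frac{k-1}{k}\,\mathrm{Prob}\Bigl[\mathrm{Bin}\bigl(k,\tfrac12\bigr)\ge \Bigl\lceil\tfrac k2\Bigr\rceil\Bigr]\right) \\
&= \frac{T_k(S)}{2}\left(\frac12\left(1-\binom{k-1}{(k-1)/2}\frac{1}{2^{k-1}}\right)+\frac{k-1}{k}\cdot\frac12\right) \\
&= \frac{T_k(S)}{2}\left(1-\binom{k-1}{(k-1)/2}\frac{1}{2^{k}}-\frac{1}{2k}\right). \qquad (24)
\end{aligned}$$

Lemma 4 shows by induction that for any odd $k > 25$

$$\binom{k-1}{(k-1)/2}\frac{1}{2^k} \le \frac{1}{2\sqrt{k}}-\frac{1}{2k}. \qquad (25)$$

(Note that by (23) we get that

$$\binom{k-1}{(k-1)/2}\frac{1}{2^k}+\frac{1}{2k} \le \frac12\left(\frac{e}{\pi}\cdot\frac{1}{\sqrt{k-1}}+\frac1k\right),$$

which is smaller than $1/(2\sqrt{k})$ for $k \ge 63$.)
Substituting (25) into (24) we get

$$E\bigl[T_{\lfloor k/2\rfloor}(Y)\bigr] \ge \frac{T_k(S)}{2}\left(1-\frac{1}{2\sqrt{k}}\right),$$

finishing the proof for $k$ odd.

□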
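As a numerical sanity check of the proof above, the sum supplied by Lemma 3 (with $t = \lfloor k/2\rfloor$) can be compared directly against the factor $\frac12\bigl(1-\frac{1}{2\sqrt{k}}\bigr)$ from Lemma 2. The sketch below is not part of the paper; the function names are hypothetical and the range of $k$ tested is an arbitrary choice.

```python
from math import comb, sqrt

def lemma3_sum(k):
    """sum_{r=1..k} C(k,r) * 2^-k * min(r, floor(k/2)) / k, the factor multiplying T_k(S)."""
    t = k // 2
    numerator = sum(comb(k, r) * min(r, t) for r in range(1, k + 1))
    return numerator / (k * 2**k)

def lemma2_factor(k):
    """(1/2) * (1 - 1/(2*sqrt(k))), the factor claimed in Lemma 2."""
    return 0.5 * (1 - 1 / (2 * sqrt(k)))

if __name__ == "__main__":
    for k in (26, 27, 50, 101):
        print(k, lemma3_sum(k), lemma2_factor(k))
```

For even $k$ the sum agrees with the closed form $\frac12\bigl(1-\binom{k}{k/2}2^{-(k+1)}\bigr)$ from (22), and for odd $k$ with $\frac12\bigl(1-\binom{k-1}{(k-1)/2}2^{-k}-\frac{1}{2k}\bigr)$ from (24).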


Lemma 3. Let $S = \{s_1 \ge s_2 \ge \cdots \ge s_n\}$ and let $S'$ be a subset of $S$ chosen uniformly at random amongst all subsets of $S$. Let $t, k$ be integers s.t. $t \le k \le n$. Then

$$E\bigl[T_t(S')\bigr] \ge T_k(S)\sum_{r=1}^{k}\binom{k}{r}\frac{1}{2^k}\cdot\frac{\min(r,t)}{k}.$$

Proof. Let $R = \{s_1 \ge \ldots \ge s_k\}$ be the $k$ elements of largest value in $S$. Conditioned upon $|S'\cap R| = r$, the expectation of $v(S'\cap R)$ is $r/k$ times the sum of the values in $R$, which is $T_k(S)$.

The lemma now follows by summing over all possible values of $r$ using the following two facts:

1. The probability that $|S'\cap R| = r$ is $\binom{k}{r}\frac{1}{2^k}$.

2. If $t \le r$ then $T_t(S') \ge T_t(S'\cap R) \ge \frac{t}{r}\,v(S'\cap R)$, and if $t > r$ then $T_t(S') \ge v(S'\cap R)$.

□


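Lemma 4. For any odd $k > 25$

$$\binom{k-1}{(k-1)/2}\frac{1}{2^k} \le \frac{1}{2\sqrt{k}}-\frac{1}{2k}.$$

Proof. We prove the lemma by induction on $k$.

Note that this inequality doesn't hold for $k \le 25$.
For simplicity, we prove the equivalent inequality:

$$\left(\binom{k-1}{(k-1)/2}\frac{1}{2^k}+\frac{1}{2k}\right)\cdot 2\sqrt{k} \le 1.$$

For the basis of the induction we verify this inequality for $k = 27$:

$$\left(\binom{26}{13}\frac{1}{2^{27}}+\frac{1}{54}\right)\cdot 2\sqrt{27} \approx 0.9978 \le 1.$$

We assume that the inequality holds for some odd $k$ and show that it holds for $k+2$. In the following derivation Inequality (26) follows by the induction hypothesis.

$$\begin{aligned}
\left(\binom{k+1}{(k+1)/2}\frac{1}{2^{k+2}}+\frac{1}{2(k+2)}\right)\cdot 2\sqrt{k+2}
&= \left(\frac{k(k+1)}{((k+1)/2)^2}\cdot\frac14\cdot\binom{k-1}{(k-1)/2}\frac{1}{2^k}+\frac{1}{2(k+2)}\right)\cdot 2\sqrt{k+2} \\
&\le \left(\frac{k}{k+1}\left(\frac{1}{2\sqrt{k}}-\frac{1}{2k}\right)+\frac{1}{2(k+2)}\right)\cdot 2\sqrt{k+2} \qquad (26)\\
&= \frac{\sqrt{k}\sqrt{k+2}}{k+1}-\frac{\sqrt{k+2}}{k+1}+\frac{1}{\sqrt{k+2}} \\
&= \frac{\sqrt{k}\sqrt{k+2}}{k+1}+\frac{k+1-(k+2)}{(k+1)\sqrt{k+2}} \\
&= \frac{\sqrt{k^2+2k+1-1}}{k+1}-\frac{1}{(k+1)\sqrt{k+2}} \\
&= \frac{\sqrt{(k+1)^2-1}}{k+1}-\frac{1}{(k+1)\sqrt{k+2}} \\
&< 1+0 \\
&= 1.
\end{aligned}$$

□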
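The equivalent form of the inequality in Lemma 4 is easy to evaluate numerically, which also shows why the induction has to start at $k = 27$. The following is a small illustrative check, not part of the paper; `lemma4_lhs` is a hypothetical helper name.

```python
from math import comb, sqrt

def lemma4_lhs(k):
    """Left-hand side of the equivalent inequality:
    (C(k-1, (k-1)/2) / 2^k + 1/(2k)) * 2*sqrt(k), for odd k."""
    if k % 2 == 0:
        raise ValueError("k must be odd")
    return (comb(k - 1, (k - 1) // 2) / 2**k + 1 / (2 * k)) * 2 * sqrt(k)

if __name__ == "__main__":
    for k in (23, 25, 27, 29, 63, 101):
        # Lemma 4 claims the value is at most 1 for odd k > 25.
        print(k, lemma4_lhs(k), lemma4_lhs(k) <= 1)
```

The value exceeds 1 at $k = 23$ and $k = 25$ and drops just below 1 at $k = 27$, consistent with the remark that the inequality does not hold for small $k$.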


Definitions: The following definitions are used in the lemmata that follow.

- Let $S^{[0,1/2)} = \{y_1 \ge y_2 \ge \cdots \ge y_m\} \subseteq S$ be random variables for the set of items arriving during the time period $[0,1/2)$,

- and let $S^{[1/2,1)} = \{z_1 \ge z_2 \ge \cdots \ge z_{|S|-m}\} \subseteq S$ be random variables for the set of items arriving during the time period $[1/2,1)$.

- Let $Q = |\{z \in S^{[1/2,1)} \mid z > T\}|$ be the number of elements in $S^{[1/2,1)}$ greater than the threshold $T$. In Algorithm 1 we define the threshold $T$ where $T = y_r$, $r = \lceil k/2\rceil$, if $|S^{[0,1/2)}| \ge r$ and $T = 0$ otherwise (in this case we consider any item of value 0 as greater than $T$).

- Let $G_i$, $1 \le i \le r$, be identical independent geometric random variables, such that for any integer $j \ge 0$, $\mathrm{Prob}[G_i = j] = \frac{1}{2^{j+1}}$. It follows that the expectation of $G_i$ is $E[G_i] = 1$, whereas the variance is $\sigma^2[G_i] = 2$. Let $G = \sum_{i=1}^{r} G_i$. Note also that $E[G] = r$ and $\sigma^2[G] = 2r$.

- We abbreviate $E_{\theta:S\to[0,1)}[v(C_{k,\gamma}(S,\theta))]$ as $E[C_{k,\gamma}(S)]$ as in the proof of Theorem 2.


Lemma 5. Given that $|S| \ge 3\lceil k/2\rceil$, for any $\gamma \le \gamma^* = 0.003176$ and any $k \le 1/\gamma$, the expected profit of $C_{k,\gamma}$ during time interval $[1/2,1)$ is at least

$$E\left[C^{[1/2,1)}_{k,\gamma}(S)\right] \ge \frac{1}{1+\gamma k}\bigl(1-\sqrt2\cdot\alpha(\gamma)-4.5\gamma\bigr)\left(1-\frac{2\sqrt{1+1/k}}{\sqrt{k}}-\frac1k\right)\frac{T_k(S)}{2},$$

where $\alpha(\gamma) = 3\sqrt{\gamma\ln(1/\gamma)}$ is defined in Theorem 3.


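Proof. Splitting the probability space by conditioning on $Q$ we obtain

$$\begin{aligned}
E\left[C^{[1/2,1)}_{k,\gamma}(S)\right]
&= \sum_{q=0}^{\infty}\mathrm{Prob}[Q=q]\cdot E\left[C^{[1/2,1)}_{k,\gamma}(S)\,\middle|\,Q=q\right] \\
&\ge \sum_{q=0}^{2\lceil k/2\rceil}\mathrm{Prob}[Q=q]\cdot E\left[C^{[1/2,1)}_{k,\gamma}(S)\,\middle|\,Q=q\right]. \qquad (27)
\end{aligned}$$

Lemma 6 shows that $E\bigl[C^{[1/2,1)}_{k,\gamma}(S)\mid Q=q\bigr] \ge \frac{1}{1+\gamma k}\bigl(1-\sqrt2\cdot\alpha(\gamma)-4.5\gamma\bigr)\bigl(1-2\frac{|q-\lceil k/2\rceil|}{k}-\frac1k\bigr)\frac{T_k(S)}{2}$. Substituting this lower bound in (27) we obtain

$$\begin{aligned}
E\left[C^{[1/2,1)}_{k,\gamma}(S)\right]
&\ge \sum_{q=0}^{2\lceil k/2\rceil}\mathrm{Prob}[Q=q]\,\frac{1}{1+\gamma k}\bigl(1-\sqrt2\cdot\alpha(\gamma)-4.5\gamma\bigr)\left(1-2\frac{|q-\lceil k/2\rceil|}{k}-\frac1k\right)\frac{T_k(S)}{2} \\
&= \frac{1}{1+\gamma k}\bigl(1-\sqrt2\cdot\alpha(\gamma)-4.5\gamma\bigr)\frac{T_k(S)}{2}\sum_{q=0}^{2\lceil k/2\rceil}\mathrm{Prob}[Q=q]\left(1-2\frac{|q-\lceil k/2\rceil|}{k}-\frac1k\right).
\end{aligned}$$

Lemma 7 proves that if $|S| \ge 3\lceil k/2\rceil$ then for any $q \le 2\lceil k/2\rceil$ the probability that $Q = q$ is the same as the probability that the sum of $\lceil k/2\rceil$ identical geometric random variables is equal to $q$. We denote this sum by $G$ and derive the following lower bound on the last sum of the equation above:

$$\begin{aligned}
\sum_{q=0}^{2\lceil k/2\rceil}\mathrm{Prob}[Q=q]\left(1-2\frac{|q-\lceil k/2\rceil|}{k}-\frac1k\right)
&= \sum_{q=0}^{2\lceil k/2\rceil}\mathrm{Prob}[G=q]\left(1-2\frac{|q-\lceil k/2\rceil|}{k}-\frac1k\right) \\
&= \sum_{r=0}^{\lceil k/2\rceil}\mathrm{Prob}\bigl[|G-\lceil k/2\rceil| = r\bigr]\cdot\left(1-2\frac{r}{k}-\frac1k\right) \qquad (28)\\
&\ge \sum_{r=0}^{\infty}\mathrm{Prob}\bigl[|G-\lceil k/2\rceil| = r\bigr]\cdot\left(1-2\frac{r}{k}-\frac1k\right) \qquad (29)\\
&= 1-2\,\frac{E\bigl[|G-\lceil k/2\rceil|\bigr]}{k}-\frac1k \\
&= 1-2\,\frac{E\bigl[|G-E[G]|\bigr]}{k}-\frac1k. \qquad (30)
\end{aligned}$$

Equality (28) follows by the change of variables $r = |q-\lceil k/2\rceil|$, Inequality (29) follows since $\bigl(1-2\frac{r}{k}-\frac1k\bigr) < 0$ for any $r > \lceil k/2\rceil$, and the last equality follows since $\lceil k/2\rceil = E[G]$.

Using the fact that $E[|W-E[W]|] \le \sqrt{\mathrm{Var}[W]}$ (Var here stands for variance) for any random variable $W$ (Lemma 8) and the fact that $\mathrm{Var}[G] = 2\lceil k/2\rceil \le k+1$ we get

$$\frac{E\bigl[|G-E[G]|\bigr]}{k} \le \frac{\sqrt{k+1}}{k} = \frac{\sqrt{1+1/k}}{\sqrt{k}}. \qquad (31)$$

Substituting the upper bound of Equation (31) into Equation (30), and then substituting the resulting inequality into the bound on $E\bigl[C^{[1/2,1)}_{k,\gamma}(S)\bigr]$ above, we obtain the lemma.

□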


It may be useful to observe that the lower bound given in the following Lemma 6 is approximately equal to the expected size of the $\gamma$-independent set chosen from the $q$ items arriving during the time interval $[1/2,1)$ divided by $q$, multiplied by the expected value of these $q$ elements.


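Lemma 6. For any $\gamma \le \gamma^* = 0.003176$, any $k \le 1/\gamma$, any $S$, $|S| \ge k$, and for any $0 \le q \le 2\lceil k/2\rceil$ we have

$$E\left[C^{[1/2,1)}_{k,\gamma}(S)\,\middle|\,Q=q\right] \ge \frac{1}{1+\gamma k}\bigl(1-\sqrt2\cdot\alpha(\gamma)-4.5\gamma\bigr)\left(1-2\,\frac{|q-\lceil k/2\rceil|}{k}-\frac1k\right)\frac{T_k(S)}{2}.$$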


Proof. We condition upon there being $q$ items exceeding the threshold in $S^{[1/2,1)}$. Let $S_{>T}$ be the set of all items in $S$ greater than $T$.

Assume $|S^{[0,1/2)}| \ge r$ (i.e., sufficiently many items arrive before time 1/2 so as to take $y_r$ as a threshold). Therefore there are exactly $r-1$ items $> y_r$ in $S^{[0,1/2)}$, and from the conditioning there are exactly $q$ items $> y_r$ in $S^{[1/2,1)}$, so $|S_{>T}| = r-1+q$ is the number of items in $S$ that are strictly greater than the threshold $y_r$. Thus, the probability that an item $x \in S_{>T}$ is in $S^{[1/2,1)}$ is

$$\mathrm{Prob}\bigl[x \in S^{[1/2,1)} \mid Q=q\bigr] = \frac{q}{|S_{>T}|} = \frac{q}{r+q-1}.$$

If $|S^{[0,1/2)}| < r$, the threshold is set to be zero, and $S = S_{>T}$. The size of $S$ is $|S^{[0,1/2)}|+q < r+q$. It follows that the probability that an item $x \in S_{>T}$ is in $S^{[1/2,1)}$ is

$$\mathrm{Prob}\bigl[x \in S^{[1/2,1)} \mid Q=q\bigr] = \frac{q}{|S_{>T}|} \ge \frac{q}{r+q-1}.$$

We now consider two cases: $q \le \lceil k/2\rceil$, and $\lceil k/2\rceil < q \le 2\lceil k/2\rceil$.

For $q \le \lceil k/2\rceil$ the expected sum of the values of the $q$ items in $S^{[1/2,1)}\cap S_{>T}$ is

$$\begin{aligned}
E\bigl[v(\{x \mid x \in S^{[1/2,1)}\cap S_{>T}\}) \mid Q=q\bigr]
&= \sum_{x\in S_{>T}}\mathrm{Prob}\bigl[x\in S^{[1/2,1)} \mid Q=q\bigr]\cdot v(x) \\
&\ge \frac{q}{|S_{>T}|}\sum_{x\in S_{>T}} v(x) \\
&\ge \frac{q}{k}\cdot T_k(S). \qquad (32)
\end{aligned}$$

Inequality (32) follows since $|S_{>T}| \le r-1+q \le \lceil k/2\rceil-1+\lceil k/2\rceil \le k$ and $T_k(S)/k$ (the average value of the $k$ largest items) is smaller than or equal to the average value of the items greater than $T$, $\bigl(\sum_{x\in S_{>T}} v(x)/|S_{>T}|\bigr)$.

Since $q \le \lceil k/2\rceil$, $C_{k,\gamma}$ chooses a $\gamma$-independent subset of size at least as large as the size of the maximum $\gamma$-independent subset of those among the $q$ items arriving in the interval $[1/2+\gamma,1)$.

Fixing the arrival times of the $q$ items in $S^{[1/2,1)}$ which are above the threshold, the assignment of items to these times is a random permutation. This implies that the expected value of the items in the $\gamma$-independent subset picked by $C_{k,\gamma}$ equals the cardinality of this $\gamma$-independent subset divided by $q$, multiplied by the expected value of these items. Lemma 9 gives a lower bound on the expected size of the $\gamma$-independent subset picked by $C_{k,\gamma}$ and Equation (32) gives a lower bound on the expected value of the $q$ items, therefore:


E 

> 

> 


\Q = q 


1 

g [ 1 - 1 - 27 g 
1 




^1 — V2a{'y) — 47 ^ 

■(1-720(7)-47) • [ I ' TkiS ) 


l + -f{k + l/2) 

rhk ^ 1 +7+1/2) - 7 [| ^ 


1 


1 


> 


> 


> 


> 


l + lk 1+7/2- 1/(1++t))(' ^[i; 

TT 7 ^ - 7 [f ■ 

jT* ■ “ i/2a(7) - 47 ) [| ■ ms 

l^(l-V.„,7)-«7)(l-2M)lf> 

7 (i-^“'^'- 7 (i- 2 ™ 4 )¥^ 


-I- 'yk 

1 

1 + 

1 

1 + 


(33) 


(34) 


(35)

where Inequality (34) follows since $q \le \lceil k/2\rceil$ and Inequality (35) follows since $e^x \ge 1+x$ for any $x$. In addition we have that for any $\gamma \le \gamma^*$

$$\bigl(1-\sqrt2\,\alpha(\gamma)-4.5\gamma\bigr) \ge \bigl(1-\sqrt2\,\alpha(\gamma^*)-4.5\gamma^*\bigr) > 0,$$

therefore all terms in the previous calculations are positive, meaning that decreasing any of them decreases the whole product. This concludes the proof for $q \le \lceil k/2\rceil$.

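For $\lceil k/2\rceil < q \le 2\lceil k/2\rceil$, the expected sum of the values of the $q$ items in $S^{[1/2,1)}\cap S_{>T}$ is

$$\begin{aligned}
E\bigl[v(\{x \mid x \in S^{[1/2,1)}\cap S_{>T}\}) \mid Q=q\bigr]
&\ge \sum_{x\in S_{>T}}\mathrm{Prob}\bigl[x\in S^{[1/2,1)} \mid Q=q\bigr]\cdot v(x) \\
&\ge \frac{q}{q+r-1}\sum_{x\in S_{>T}} v(x) \\
&\ge \frac{q}{q+r-1}\,T_k(S). \qquad (36)
\end{aligned}$$

If $|S^{[0,1/2)}| < r$ the threshold is zero so $S_{>T} = S$ and therefore Inequality (36) follows. Otherwise $|S_{>T}| = r-1+q \ge \lceil k/2\rceil-1+\lceil k/2\rceil+1 \ge k$, so $S_{>T}$ contains the $k$ largest items of $S$, and therefore Inequality (36) follows.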

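Let $s$ be the size of the maximum $\gamma$-independent subset of the $q$ items larger than $T$ arriving in the interval $[1/2+\gamma,1)$.

Since $C_{k,\gamma}$ is restricted to choose at most $\lceil k/2\rceil$ items in the time interval $[1/2,1)$, the size of the $\gamma$-independent subset that $C_{k,\gamma}$ chooses is at least $\min(s,\lceil k/2\rceil)$. Therefore, conditioned on $Q = q$, the expected size of this subset is at least

$$\begin{aligned}
E\bigl[\min(s,\lceil k/2\rceil) \mid Q=q\bigr]
&= E\left[\frac{s\cdot\lceil k/2\rceil}{\max(s,\lceil k/2\rceil)}\,\middle|\,Q=q\right] \\
&\ge \frac{\lceil k/2\rceil}{q}\,E[s \mid Q=q] \\
&\ge \frac{q}{1+2\gamma q}\bigl(1-\sqrt2\,\alpha(\gamma)-4\gamma\bigr)\frac{\lceil k/2\rceil}{q} \\
&\ge \frac{q}{1+2\gamma q}\bigl(1-\sqrt2\,\alpha(\gamma)-4.5\gamma\bigr)\frac{\lceil k/2\rceil}{q}, \qquad (37)
\end{aligned}$$

where Equation (37) follows from Lemma 9.

Combining Equations (36) and (37), using the same argument that we used to derive Equation (35), we get

$$E\left[C^{[1/2,1)}_{k,\gamma}(S)\,\middle|\,Q=q\right] \ge \frac{1}{1+2\gamma q}\bigl(1-\sqrt2\,\alpha(\gamma)-4.5\gamma\bigr)\frac{\lceil k/2\rceil}{q+r-1}\,T_k(S). \qquad (38)$$

A simple arithmetic manipulation (for more details see Lemma 10) gives us

$$\frac{\lceil k/2\rceil}{r+q-1} \ge \frac12\left(1-\frac{q-\lceil k/2\rceil}{k}\right). \qquad (39)$$

Substituting (39) into (38) we obtain

$$E\left[C^{[1/2,1)}_{k,\gamma}(S)\,\middle|\,Q=q\right] \ge \bigl(1-\sqrt2\,\alpha(\gamma)-4.5\gamma\bigr)\frac{1}{1+2\gamma q}\left(1-\frac{q-\lceil k/2\rceil}{k}\right)\frac{T_k(S)}{2}. \qquad (40)$$

Another simple arithmetic manipulation (for more details see Lemma 11) gives us

$$\begin{aligned}
\frac{1}{1+2\gamma q}\left(1-\frac{q-\lceil k/2\rceil}{k}\right)
&\ge \frac{1}{1+\gamma k}\left(1-2\cdot\frac{q-k/2}{k}\right) \qquad (41)\\
&\ge \frac{1}{1+\gamma k}\left(1-2\cdot\frac{|q-\lceil k/2\rceil|}{k}-\frac1k\right). \qquad (42)
\end{aligned}$$

By substituting (42) into (40) the lemma follows.

□

Lemma 7. For any set $S$ and any $q \le |S|-r$

$$\mathrm{Prob}[Q=q] = \mathrm{Prob}[G=q].$$

($Q$ and $G$ are defined in the Definitions above.)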

Proof. We reinterpret the random schedule as though we schedule the items in $S$ in decreasing order.

Repeatedly toss a fair coin until $r$ “tails” appear. The length of this sequence is the sum of $r$ geometric variables $G_i$ (number of consecutive “heads”) plus $r$ (number of “tails”); let $G = \sum_{i=1}^{r} G_i$ be the random variable for the total number of “heads” in this sequence.

Traverse this sequence until its end or until $S$ is exhausted; schedule the top item remaining in $S$ within the interval $[1/2,1)$ if “heads”, and within the interval $[0,1/2)$ if “tails”.

If $S$ has not been exhausted, place the remaining items to appear at random times in the time interval $[0,1)$.

If $S$ was exhausted strictly before the end of the sequence, we have that $G+r > |S|$ (total number of coin tosses), and the number of items arriving in the time interval $[0,1/2)$ is $< r$, thus all items arriving during $[1/2,1)$ must be counted in $Q$ and therefore $Q > |S|-r$ as well.

If $S$ was not exhausted strictly before the last coin toss, then $G+r \le |S|$; ergo, any item placed in the time interval $[1/2,1)$ before the last coin toss contributes to $Q$, and any other item must not be counted in $Q$ (either it was placed in the time interval $[0,1/2)$ or its value is less than the threshold value, the minimum of the $r$ largest items scheduled during the time interval $[0,1/2)$). Thus, $Q = G \le |S|-r$.

We conclude that $G > |S|-r$ iff $Q > |S|-r$, and for any $q \le |S|-r$, $G = q$ iff $Q = q$. So

$$\mathrm{Prob}[G > |S|-r] = \mathrm{Prob}[Q > |S|-r],$$

and for any $q \le |S|-r$

$$\mathrm{Prob}[G=q] = \mathrm{Prob}[Q=q].$$

□
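For small instances the claim of Lemma 7 can be checked exactly by enumeration. The sketch below assumes, as in the proof, that items are listed in decreasing order of value and that each item independently lands in $[0,1/2)$ or $[1/2,1)$ with probability 1/2; the function names are hypothetical and the check is only an illustration.

```python
from fractions import Fraction
from itertools import product
from math import comb

def q_distribution(n, r):
    """Exact distribution of Q for n items (sorted by decreasing value).

    Each assignment gives every item a half: 0 = [0,1/2) ("tails"),
    1 = [1/2,1) ("heads"). Q counts second-half items that precede the
    r-th first-half item; if fewer than r items land in the first half,
    the threshold is zero and every second-half item is counted.
    """
    counts = {}
    for halves in product((0, 1), repeat=n):
        tails = heads = 0
        for h in halves:
            if h == 0:
                tails += 1
                if tails == r:
                    break
            else:
                heads += 1
        counts[heads] = counts.get(heads, 0) + 1
    return {q: Fraction(c, 2**n) for q, c in counts.items()}

def geometric_sum_pmf(q, r):
    """Prob[G = q] for G the sum of r geometric variables with Prob[G_i = j] = 2^-(j+1)."""
    return Fraction(comb(q + r - 1, q), 2**(q + r))

if __name__ == "__main__":
    n, r = 10, 3
    dist = q_distribution(n, r)
    for q in range(n - r + 1):
        print(q, dist.get(q, Fraction(0)), geometric_sum_pmf(q, r))
```

The two columns printed agree for every $q \le n-r$; for larger $q$ the distributions differ, which is why the lemma restricts $q$.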

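Lemma 8. For any random variable $W$, with mean $\mu$ and standard deviation $\sigma$,

$$E\bigl[|W-\mu|\bigr] \le \sigma.$$

Proof. By the definition of the variance

$$\begin{aligned}
\mathrm{Var}\bigl[|W-\mu|\bigr] &= E\bigl[|W-\mu|^2\bigr]-E^2\bigl[|W-\mu|\bigr] \\
&= \mathrm{Var}[W]-E^2\bigl[|W-\mu|\bigr].
\end{aligned}$$

Since $\mathrm{Var}\bigl[|W-\mu|\bigr] \ge 0$ it follows that

$$E^2\bigl[|W-\mu|\bigr] \le \mathrm{Var}[W] = \sigma^2.$$

The lemma follows by taking the square root of both sides. □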
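Lemma 8 is applied in the proof of Lemma 5 to the variable $G$, the sum of $r$ geometric variables, whose variance is $2r$. As an illustration only, the following sketch evaluates $E[|G-r|]$ from the probability mass function and compares it with $\sqrt{2r}$; the truncation point `cutoff` and the function name are assumptions of this example, and truncating the series can only lower the computed value.

```python
from math import comb, sqrt

def mean_abs_deviation_of_G(r, cutoff=500):
    """Approximate E|G - r| where Prob[G = g] = C(g+r-1, g) / 2^(g+r).

    The infinite sum is truncated at `cutoff` terms; the neglected tail is
    non-negative, so the result is a slight underestimate.
    """
    total = 0.0
    for g in range(cutoff):
        prob = comb(g + r - 1, g) / 2**(g + r)
        total += prob * abs(g - r)
    return total

if __name__ == "__main__":
    for r in (1, 5, 13):
        print(r, mean_abs_deviation_of_G(r), sqrt(2 * r))
```

For $r = 1$ the mean absolute deviation is exactly 1, against the bound $\sqrt2 \approx 1.414$.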

Lemma 9. Let $0 < \gamma < 1$ and let $\Psi$ be a set of items scheduled at uniform times during the interval $[1/2,1)$, and let $\psi = |\Psi|$. Let $\Psi'$ be the subset of $\Psi$ that are scheduled during the time interval $[1/2+\gamma,1)$. The expected size of the maximum $\gamma$-independent subset of items from $\Psi'$ is at least

$$\frac{\psi}{1+2\gamma\psi}\bigl(1-\sqrt2\,\alpha(\gamma)-4\gamma\bigr),$$

where $\alpha(\gamma) = 3\sqrt{\gamma\ln(1/\gamma)}$ (as defined in Theorem 3).

Proof. If $\gamma \ge 1/2$ the lemma holds since the bound is negative, therefore assume $\gamma < 1/2$.

Let $\rho$ be the size of the maximum $\gamma$-independent subset of $\Psi$, and let $\rho'$ be the size of the maximal $\gamma$-independent subset of $\Psi'$ (both $\rho$ and $\rho'$ are random variables). It is easy to see that $\rho' \ge \rho-1$.

Let $I$ be an indicator variable where $I = 1$ if some item in $\Psi$ is scheduled during the time interval $[1/2,1/2+\gamma)$, and $I = 0$ otherwise.

$$\begin{aligned}
E[\rho'] &\ge \mathrm{Prob}[I=0]\cdot E[\rho \mid I=0]+\mathrm{Prob}[I=1]\cdot\bigl(E[\rho \mid I=1]-1\bigr) \\
&= E[\rho]-\mathrm{Prob}[I=1].
\end{aligned}$$

To bound $E[\rho]$ we recall that $\Psi$ consists of $\psi$ points randomly placed in the time interval $[1/2,1)$. Every $\gamma$-independent subset of $\Psi$ is analogous to a $2\gamma$-independent subset of $\Phi$ where $\Phi = \{2p-1 \mid p \in \Psi\}$. I.e., the size of the maximal $\gamma$-independent subset of $\Psi$ is equal to the size of the maximal $2\gamma$-independent subset of $\Phi$. Ergo, we can apply Theorem 3 and hence:

$$\begin{aligned}
E[\rho]-\mathrm{Prob}[I=1]
&\ge \frac{\psi}{1+2\gamma\psi}\bigl(1-\alpha(2\gamma)\bigr)-\mathrm{Prob}[I=1] \\
&= \frac{\psi}{1+2\gamma\psi}\left(1-\alpha(2\gamma)-\frac{\bigl(1-(1-2\gamma)^{\psi}\bigr)(1+2\gamma\psi)}{\psi}\right) \\
&= \frac{\psi}{1+2\gamma\psi}\left(1-\alpha(2\gamma)-\frac{1-(1-2\gamma)^{\psi}}{\psi}-2\gamma\bigl(1-(1-2\gamma)^{\psi}\bigr)\right) \\
&\ge \frac{\psi}{1+2\gamma\psi}\bigl(1-\alpha(2\gamma)-2\gamma-2\gamma\bigr) \qquad (43)\\
&= \frac{\psi}{1+2\gamma\psi}\bigl(1-\alpha(2\gamma)-4\gamma\bigr) \\
&\ge \frac{\psi}{1+2\gamma\psi}\bigl(1-3\sqrt{2\gamma\ln(1/\gamma)}-4\gamma\bigr) \\
&= \frac{\psi}{1+2\gamma\psi}\bigl(1-\sqrt2\,\alpha(\gamma)-4\gamma\bigr),
\end{aligned}$$

where Inequality (43) follows since for any $n \in \mathbb{N}$ and any $x \in (0,1)$, $(1-x)^n \ge 1-nx$. □


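Lemma 10. For any $k \in \mathbb{N}$ and any $q > \lceil k/2\rceil$

$$\frac{\lceil k/2\rceil}{r+q-1} \ge \frac12\left(1-\frac{q-\lceil k/2\rceil}{k}\right).$$

Proof.

$$\begin{aligned}
\frac{\lceil k/2\rceil}{r+q-1} &= \frac{\lceil k/2\rceil}{\lceil k/2\rceil+q-1} \\
&= \frac{\lceil k/2\rceil}{2\lceil k/2\rceil+(q-\lceil k/2\rceil)-1} \\
&\ge \frac{\lceil k/2\rceil}{2\lceil k/2\rceil+(q-\lceil k/2\rceil)} \qquad (44)\\
&= \frac12\left(1-\frac{q-\lceil k/2\rceil}{2\lceil k/2\rceil+(q-\lceil k/2\rceil)}\right) \\
&\ge \frac12\left(1-\frac{q-\lceil k/2\rceil}{2\lceil k/2\rceil}\right) \qquad (45)\\
&\ge \frac12\left(1-\frac{q-\lceil k/2\rceil}{2(k/2)}\right), \qquad (46)
\end{aligned}$$

where (44), (45) and (46) follow since $q > \lceil k/2\rceil$.

□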
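Because Lemma 10 is a purely arithmetic statement, it can be checked in exact rational arithmetic over a range of parameters. The snippet below is an illustrative check only, taking $r = \lceil k/2\rceil$ as in the Definitions; the function name and the tested ranges are arbitrary.

```python
from fractions import Fraction

def lemma10_holds(k, q):
    """Check ceil(k/2)/(r+q-1) >= (1/2)(1 - (q - ceil(k/2))/k) with r = ceil(k/2)."""
    r = -(-k // 2)  # ceil(k/2) using integer arithmetic
    lhs = Fraction(r, r + q - 1)
    rhs = Fraction(1, 2) * (1 - Fraction(q - r, k))
    return lhs >= rhs

if __name__ == "__main__":
    violations = [(k, q) for k in range(1, 61)
                  for q in range(-(-k // 2) + 1, 2 * k + 5)
                  if not lemma10_holds(k, q)]
    print("violations:", violations)
```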


The following lemma is used in the proof of Lemma 6 for bounding the profit of the algorithm from items in the second half, conditioned on there being $q$ such items above the threshold.


Lemma 11. For any $k \in \mathbb{N}$, $\lceil k/2\rceil < q \le 2\lceil k/2\rceil$ and any $\gamma$ that satisfies $k \le 1/\gamma$ we have

$$\frac{1}{1+2\gamma q}\left(1-\frac{q-\lceil k/2\rceil}{k}\right) \ge \frac{1}{1+\gamma k}\left(1-2\cdot\frac{q-k/2}{k}\right). \qquad (47)$$

Proof. While the factor on the left in the left hand side of (47) is smaller than the factor on the left in the right hand side of (47), we show that this is compensated for by the factors on the right of both sides of the inequality.

Note that for any $\lceil k/2\rceil < q \le 2\lceil k/2\rceil$,

$$\begin{aligned}
1-\frac{q-\lceil k/2\rceil}{k} &\ge 1-\frac{\lceil k/2\rceil}{k} \qquad (48)\\
&\ge 1-\frac{k/2+1/2}{k} \\
&= 1/2\cdot(1-1/k) \\
&\ge 0. \qquad (49)
\end{aligned}$$

The derivation (48) above follows by substituting the maximal value $q$ can take. In Equation (47) both factors $1/(1+2\gamma q)$ (on the left) and $1/(1+\gamma k)$ (on the right) are strictly positive since both $q$ and $k$ are strictly positive. It therefore follows from (49) that the left hand side of Equation (47) is non-negative. It also follows that if $1-2(q-k/2)/k \le 0$ then the right hand side of Equation (47) is $\le 0$ and the lemma holds.

Thus, we may assume that the right hand side of Equation (47) is strictly positive, and thus that

$$1-2\cdot\frac{q-k/2}{k} > 0. \qquad (50)$$

As $q > \lceil k/2\rceil$ it follows that $q-k/2 > 0$ and thus inequality (50) implies that

$$1-\frac{q-k/2}{k} > 0. \qquad (51)$$

As $\lceil k/2\rceil \ge k/2$ we can bound the left hand side of Equation (47) as follows:

$$\frac{1}{1+2\gamma q}\left(1-\frac{q-\lceil k/2\rceil}{k}\right) \ge \frac{1}{1+2\gamma q}\left(1-\frac{q-k/2}{k}\right). \qquad (52)$$

As the right hand side of Equation (47) is strictly positive, we can multiply and divide the right hand side of (52) by the right hand side of Equation (47):

$$\frac{1}{1+2\gamma q}\left(1-\frac{q-k/2}{k}\right) = \frac{1}{1+\gamma k}\left(1-2\cdot\frac{q-k/2}{k}\right)\cdot\left(\frac{1+\gamma k}{1+2\gamma q}\cdot\frac{1-\frac{q-k/2}{k}}{1-2\cdot\frac{q-k/2}{k}}\right). \qquad (53)$$

Note that the right hand side above in Equation (53) is of the form RHS(Equation 47) times some factor $t$. By assumption RHS(Equation 47) $> 0$, so, given Equation (52) and that $t \ge 1$, the lemma holds, so it suffices to show that $t \ge 1$:

$$t = \frac{1+\gamma k}{1+2\gamma q}\cdot\frac{1-\frac{q-k/2}{k}}{1-2\cdot\frac{q-k/2}{k}}. \qquad (54)$$

As $\gamma, k, q$ are non-negative, and given Inequalities (50) and (51), it follows that the numerators and denominators, in both fractions whose product is on the right hand side of Equation (54), are strictly positive.

We substitute $r$ for $q-k/2$; note that $r > 0$. It now follows that

$$\begin{aligned}
\frac{1+\gamma k}{1+2\gamma q}\cdot\frac{1-\frac{q-k/2}{k}}{1-2\cdot\frac{q-k/2}{k}}
&= \frac{1+\gamma k}{1+2\gamma(k/2+r)}\cdot\frac{1-\frac{r}{k}}{1-\frac{2r}{k}} \\
&= \frac{1+\gamma k}{1+\gamma k+2\gamma r}\cdot\frac{k-r}{k-2r}.
\end{aligned}$$

So it suffices to show:

$$(1+\gamma k)(k-r) \ge (1+\gamma k+2\gamma r)(k-2r)$$

or, equivalently, $(1+\gamma k)(k-r)-(1+\gamma k+2\gamma r)(k-2r) \ge 0$:

$$\begin{aligned}
(1+\gamma k)(k-r)-(1+\gamma k+2\gamma r)(k-2r)
&= (1+\gamma k)k-(1+\gamma k)r-(1+\gamma k)k+2(1+\gamma k)r-2\gamma rk+4\gamma r^2 \\
&= (1+\gamma k)r-2\gamma r\cdot k+4\gamma r^2 \\
&= r(1+\gamma k-2\gamma k+4\gamma r) \\
&= r(1-\gamma k+4\gamma r) \\
&\ge 0,
\end{aligned}$$

where the last inequality holds since $k \le 1/\gamma$ (and therefore $k\gamma \le 1$) and $4\gamma r \ge 0$.

□
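Inequality (47) can likewise be checked exactly over a grid of parameters. The following is a hedged illustration, not the authors' code: the helper name `lemma11_sides`, the grid of $k$ values and the particular choices of $\gamma \le 1/k$ are all assumptions of this example.

```python
from fractions import Fraction

def lemma11_sides(k, q, gamma):
    """Return (lhs, rhs) of inequality (47) as exact fractions."""
    ceil_half = -(-k // 2)  # ceil(k/2)
    lhs = (1 - Fraction(q - ceil_half, k)) / (1 + 2 * gamma * q)
    rhs = (1 - 2 * (q - Fraction(k, 2)) / k) / (1 + gamma * k)
    return lhs, rhs

def lemma11_holds_on_grid(max_k=40):
    """Check (47) for ceil(k/2) < q <= 2*ceil(k/2) and a few gamma <= 1/k."""
    for k in range(1, max_k + 1):
        ceil_half = -(-k // 2)
        for q in range(ceil_half + 1, 2 * ceil_half + 1):
            for gamma in (Fraction(1, k), Fraction(1, 2 * k), Fraction(1, 100 * k)):
                lhs, rhs = lemma11_sides(k, q, gamma)
                if lhs < rhs:
                    return False
    return True

if __name__ == "__main__":
    print(lemma11_holds_on_grid())
```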


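C Proof of Theorem 3

Define

$$\alpha(x) = 3\sqrt{x\ln(1/x)},$$

$$t = \max\{x \mid \alpha(x) \le 1\}, \quad t = 0.032704 \approx 1/31.$$

Theorem 3. Let $Z = \{z_1, z_2, \ldots, z_n\}$ be a set of independently uniform samples, $z_i$, from the real interval $[0,1)$. For $0 < \gamma < 1$,

$$E_Z[m(Z,\gamma)] \ge \frac{1}{\gamma+1/n}\bigl(1-3\sqrt{\gamma\ln(1/\gamma)}\bigr) = \frac{n}{1+\gamma n}\bigl(1-3\sqrt{\gamma\ln(1/\gamma)}\bigr), \qquad (5)$$

where $m(Z,\gamma)$ denotes the size of the largest $\gamma$-independent subset of $Z$.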
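To get a feel for the bound in Theorem 3, one can estimate $E_Z[m(Z,\gamma)]$ by simulation and compare it with the right-hand side of (5). The sketch below is an illustration only: it computes $m(Z,\gamma)$ with a left-to-right greedy scan (relying on the optimality of that greedy rule), and the sample sizes, seed and function names are arbitrary choices of this example.

```python
import math
import random

def greedy_size(points, gamma):
    """Size of the gamma-independent set found by a left-to-right greedy scan."""
    count, last = 0, None
    for x in sorted(points):
        if last is None or x - last >= gamma:
            count += 1
            last = x
    return count

def theorem3_bound(n, gamma):
    """Right-hand side of (5): n/(1 + gamma*n) * (1 - 3*sqrt(gamma*ln(1/gamma)))."""
    return n / (1 + gamma * n) * (1 - 3 * math.sqrt(gamma * math.log(1 / gamma)))

def estimate_expected_size(n, gamma, trials=500, seed=0):
    """Monte Carlo estimate of E[m(Z, gamma)] for n uniform points in [0,1)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += greedy_size([rng.random() for _ in range(n)], gamma)
    return total / trials

if __name__ == "__main__":
    for n, gamma in ((200, 0.001), (50, 0.01)):
        print(n, gamma, estimate_expected_size(n, gamma), theorem3_bound(n, gamma))
```

In these two settings the simulated average sits well above the bound, as expected for a lower bound of this form.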

We first give an overview of the proof of Theorem 3 and then give the full proof in detail.

C.l Proof Overview 

We need the following definitions for the proof of Theorem 3. Let $Z = \{z_1,\ldots,z_n\}$ and define the random variable $X_i$, $1 \le i \le n$, to be the $i$'th smallest point in $Z$. Define the random variable $C_i$ to be the number of $X_j$'s that lie in the interval $[X_i, X_i+\gamma)$. At most one of the $X_j \in [X_i, X_i+\gamma)$ can belong to a $\gamma$-independent set.

Given any set $S$ of points we define the following greedy algorithm that constructs a $\gamma$-independent set. The greedy algorithm initializes the set with the smallest point in $S$ and then traverses the remaining points in increasing order and adds the point $x$ to the $\gamma$-independent set if $x$ is larger by at least $\gamma$ than any point previously added. We denote by $G(S,\gamma)$ the $\gamma$-independent set computed by applying the greedy algorithm to $S$. Let $I_i$ be a random variable with binary values where $I_i = 1$ iff $X_i \in G(Z,\gamma)$, and define $p_i = \mathrm{Prob}[I_i = 1]$.

Lemma 12 proves the well known fact that the greedy algorithm picks a $\gamma$-independent set of largest size. So $m(Z,\gamma) = |G(Z,\gamma)|$. Our proof uses this fact.
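The greedy algorithm just described is short enough to write out, and its optimality can be spot-checked against exhaustive search on small inputs. The code below is a sketch under the definitions above; the names `greedy_independent` and `brute_force_max_independent` are hypothetical, and the brute-force routine is only practical for a handful of points.

```python
import itertools
import random

def greedy_independent(points, gamma):
    """Scan points in increasing order; keep a point if it is at least
    gamma larger than the last point kept (hence than every point kept)."""
    chosen = []
    for x in sorted(points):
        if not chosen or x - chosen[-1] >= gamma:
            chosen.append(x)
    return chosen

def brute_force_max_independent(points, gamma):
    """Largest size of a subset whose sorted consecutive gaps are all >= gamma."""
    pts = sorted(points)
    for size in range(len(pts), 0, -1):
        for subset in itertools.combinations(pts, size):
            if all(b - a >= gamma for a, b in zip(subset, subset[1:])):
                return size
    return 0

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        pts = [rng.random() for _ in range(8)]
        print(len(greedy_independent(pts, 0.1)), brute_force_max_independent(pts, 0.1))
```

Comparing only against the last chosen point suffices because the points are visited in increasing order.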

We give an outline of the proof of Theorem 3. The details are in Section C.2. The proof splits into two main subcases:

“Small $n$”: In this case the expected number of non-overlapping intervals of the form $(z_i-\gamma, z_i]$ that contain no point $z_j$, $j \ne i$, is sufficiently large. Specifically, Lemma 13 shows that for $n \le e\cdot\ln(1/\gamma)/\gamma$ the expected number of intervals of this form is at least $\frac{n}{1+\gamma n}(1-\alpha(\gamma))$. All such $z_i$ will be chosen by the greedy algorithm, thus proving the theorem for small $n$.

“Larger $n$”: For $n > e\cdot\ln(1/\gamma)/\gamma$, we need to consider the expected number of points $z_j$ “discarded” when the greedy algorithm chooses some $z_i$, so the key point in this part of the proof is to bound $E[C_i \mid I_i = 1]$.

Given a set of points $Z$, since the greedy algorithm picks a maximal $\gamma$-independent set we have that $m(Z,\gamma) = \sum_{i=1}^{n} I_i$. Therefore $E[m(Z,\gamma)] = \sum_{i=1}^{n} p_i$. So to prove the theorem we seek a lower bound for $\sum_{i=1}^{n} p_i$.

We note that $\sum_{i=1}^{n} C_i\cdot I_i = n$. This follows because (a) the $C_i$ for which $I_i = 1$ are non-overlapping so this sum is $\le n$, and (b) $\sum_{i=1}^{n} C_i\cdot I_i < n$ implies that the greedy algorithm skipped over a point that should have been chosen. In particular, the expectation of $\sum_{i=1}^{n} C_i\cdot I_i$ is also $n$.

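Contrawise, it must be that

$$E\left[\sum_{i=1}^{n} C_i\cdot I_i\right] = \sum_{i=1}^{n} E[C_i \mid I_i = 1]\cdot\mathrm{Prob}(I_i = 1) = \sum_{i=1}^{n} p_i\,E[C_i \mid I_i = 1].$$

If we could show that

$$E[C_i \mid I_i = 1] \le \beta \qquad (55)$$

for all $i \in \{1,\ldots,n\}$ then it would follow that $\sum_{i=1}^{n} p_i \ge n/\beta$, giving us a lower bound on the sum of the $p_i$'s.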

In fact we show (Lemma [18]) that the bound (|55l) holds for i = 1,..., [n/4j 
with 

,3=1 + 1 +-, (56) 

To get an intuition of why the upper bound 


E[a,ii 


1] < t + 1 -h 

n 


yn 



(57) 


holds for $i \le \lfloor n/4\rfloor$, we note that for small $\gamma$ it is close to $1+\gamma n$ and we expect to see $\le \gamma n$ points in an interval of length $\gamma$ following any specific point. Specifically, note that when $\gamma \to 0$, since $n > e\cdot\ln(1/\gamma)/\gamma$, then $n \to \infty$ and


lim 

7—vO 


4 

3 




n 


so £[(7^ I li = 1] Ri 1 + 771. 

We prove Inequality (1571) for i = 1,..., [n/4j in two steps. Lemma ITOl shows 
that if the greedy algorithm chooses qi then with high probability 


X, < 




(58) 


(Informally this is to say that Xi is concentrated around its mean, which is 
ijin + I)). Next, Lemma [211 and Lemma [22] imply that if the greedy algorithm 
chooses Xi then 

¥.[C^\X, = Xi] = l + -^{n-i). (59) 

l-x^ 

Substituting Equation (1551) in (1551) we get (1571) . 

The claim that n = I = I] which we mentioned before, extends 

(with the same argument) to partial sums. I.e., for all fc < n 


k 

k < I /^ = 1]. 

i=0 


Using this with k = [n/4j and substituting the bound (I55p which we have 
proven for 1 < z < [n/dj, we get that Pi — Let Ai = 

{X,,...,Xy n/ 4 j}; "we have that the expected size of the maximal independent 
set in A is at least [n/4j//3. 

In Lemma|53|we show that for any 0 < z < 4 the expected size of the maximal 
7 -independent set in the point sets Ai = ) ^(i-i-i)[n/ 4 j} are equal 

(and thus at least [n/dj//?). 

It therefore follows that in the union A 1 UA 2 UA 3 UA 4 the size of the maximal 
7 -independent set is at least 4[n/4j//? — 3. Ergo, the size of the maximal 7 - 
independent set in gi,..., is at least (n — 1 )//? — 3 > n//3 — 4. As the greedy 
algorithm is optimal (Lemma 1171) it follows that the greedy algorithm gives a 
7 -independent set of size at least n//? — 4. 

The theorem follows by showing that for “large” n, 


n 


71 

0(l)>-^(l-a(7)). 

1 - 1-717 


C.2 Full Proof of Theorem 3

Proof. First let us make the following definitions:

- Define $Z$ to be the set $\{z_1, z_2, \ldots, z_n\}$ where each $z_i$ is sampled uniformly in $[0,1)$.

- Define the random variable $X_i$, $1 \le i \le n$, to be the $i$'th smallest point in $Z$.

- Define the random variable $C_i$ to be the number of $X_j$'s that lie in the interval $[X_i, X_i+\gamma)$. Note that at most one of the $X_j \in [X_i, X_i+\gamma)$ can belong to a $\gamma$-independent set.

- Given any set $S$ of points we define the following greedy algorithm that constructs a $\gamma$-independent set. The greedy algorithm initializes the set with the smallest point in $S$ and then traverses the remaining points in increasing order and adds the point $x$ to the $\gamma$-independent set if $x$ is larger by at least $\gamma$ than any point previously added. We denote by $G(S,\gamma)$ the $\gamma$-independent set computed by applying the greedy algorithm to $S$. Note that $|G(Z,\gamma)| = m(Z,\gamma)$ (Lemma 12).

- Let $G(Z,\gamma)$ be a random variable which is the $\gamma$-independent set obtained by the greedy algorithm when applied to the set $Z$.

- Let $I_i$ be a random variable with binary values where $I_i = 1$ iff $X_i \in G(Z,\gamma)$, and define

  $$p_i = \mathrm{Prob}[I_i = 1].$$

For $n \le 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$ the statement follows from Lemma 13.

So from here to the end of the proof we assume that $n > 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$.
Let $A_i$, $0 \le i < 4$, be the set $\{X_{1+i\lfloor n/4\rfloor}, \ldots, X_{(i+1)\lfloor n/4\rfloor}\}$.

Note that for any 1 < i < 4 

max(A,) = =min(A,+i) . 


It is easy to verify that 


E[m(Z', 7 )] > E 



3 

> ^E[TO(^i,7)] -3 

i =0 


By Lemma [23l E[m(Ai, 7 )] = E[to(^ 0 ) 7 )] for * = li2, 3. By Lemma IT^ m(Z, 7 ) 
is the size of the independent set picked by the greedy algorithm and by the 
definition of the pi’s we have that E[to(^o, 7 )] = Yl\=i^ Pi- ^o we get 


[n/4J 

E[m(Z, 7 )] >4- ^ K-3 . (60) 

i=l 

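For simplicity let $r(\gamma)$ be as defined in Lemma 16,

$$r(\gamma) = k\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\beta(\gamma)}{4}\,\gamma\ln(1/\gamma) + \frac{\beta(\gamma)\ln(\beta(\gamma))}{4}\,\gamma},$$

where $k \approx 2.490795$ is as defined in Lemma 16 and $\beta(\gamma) = 1 + \frac{2}{\ln(1/\gamma)-1}$ is as
defined in Lemma 27.

Substituting the lower bound on $\sum_{i=1}^{\lfloor n/4\rfloor} p_i$ given by Lemma 16 in Equation
(60) we get

$$\begin{aligned}
\mathrm{E}[m(Z,\gamma)] &\ge 4\cdot\frac{n}{4}\cdot\frac{1}{1+\gamma n}\Big(1 - r(\gamma) - \frac{4}{n}\Big) - 3 \\
&= \frac{n}{1+\gamma n}\Big(1 - r(\gamma) - \frac{4}{n} - \frac{3(1+\gamma n)}{n}\Big) \\
&= \frac{n}{1+\gamma n}\Big(1 - \Big(r(\gamma) + \frac{7}{n} + 3\gamma\Big)\Big).
\end{aligned}$$

Since $n > 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$ then

$$\begin{aligned}
\mathrm{E}[m(Z,\gamma)] &\ge \frac{n}{1+\gamma n}\Big(1 - \Big(r(\gamma) + \frac{7}{n} + 3\gamma\Big)\Big) \\
&\ge \frac{n}{1+\gamma n}\Big(1 - \Big(r(\gamma) + \frac{7\sqrt{\gamma}}{2\sqrt{e}\sqrt{\ln(1/\gamma)}} + 3\gamma\Big)\Big). \qquad (61)
\end{aligned}$$

Now we upper bound the terms $r(\gamma)$, $\frac{7\sqrt{\gamma}}{2\sqrt{e}\sqrt{\ln(1/\gamma)}}$, and $3\gamma$ for $0 < \gamma < t$. We
start with $r(\gamma)$.

$$\begin{aligned}
r(\gamma) &= k\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\beta(\gamma)}{4}\,\gamma\ln(1/\gamma) + \frac{\beta(\gamma)\ln(\beta(\gamma))}{4}\,\gamma} \\
&= \sqrt{\gamma\ln(1/\gamma)}\cdot k\sqrt{\frac{1}{2}\cdot\frac{\ln\big(1/(1-\gamma)^{1/\gamma}\big)}{\ln(1/\gamma)} + \frac{\beta(\gamma)}{4} + \frac{\beta(\gamma)\ln(\beta(\gamma))}{4\ln(1/\gamma)}} \\
&\le \sqrt{\gamma\ln(1/\gamma)}\cdot k\sqrt{\frac{1}{2}\cdot\frac{\ln\big(1/(1-t)^{1/t}\big)}{\ln(1/t)} + \frac{\beta(t)}{4} + \frac{\beta(t)\ln(\beta(t))}{4\ln(1/t)}},
\end{aligned}$$

where the last inequality follows since the functions $\frac{1}{\ln(1/\gamma)}$, $\ln\big(1/(1-\gamma)^{1/\gamma}\big)$
and $\beta(\gamma)$ are monotonic increasing in $\gamma$. So we have that

$$r(\gamma) \le c_1\sqrt{\gamma\ln(1/\gamma)},$$

where $c_1 = k\sqrt{\frac{1}{2}\cdot\frac{\ln(1/(1-t)^{1/t})}{\ln(1/t)} + \frac{\beta(t)}{4} + \frac{\beta(t)\ln(\beta(t))}{4\ln(1/t)}} \approx 2.058664$ is a constant.

Now we look at the term $\frac{7\sqrt{\gamma}}{2\sqrt{e}\sqrt{\ln(1/\gamma)}}$:

$$\frac{7\sqrt{\gamma}}{2\sqrt{e}\sqrt{\ln(1/\gamma)}} = \frac{7}{2\sqrt{e}\ln(1/\gamma)}\cdot\sqrt{\gamma\ln(1/\gamma)} \le \frac{7}{2\sqrt{e}\ln(1/t)}\cdot\sqrt{\gamma\ln(1/\gamma)} = c_2\sqrt{\gamma\ln(1/\gamma)},$$

where the last inequality follows since the function $\frac{1}{\ln(1/\gamma)}$ is monotonic increasing
in $\gamma$, and $c_2 = \frac{7}{2\sqrt{e}\ln(1/t)} \approx 0.620670$.

Similarly for the term $3\gamma$:

$$3\gamma = 3\sqrt{\frac{\gamma}{\ln(1/\gamma)}}\cdot\sqrt{\gamma\ln(1/\gamma)} \le 3\sqrt{\frac{t}{\ln(1/t)}}\cdot\sqrt{\gamma\ln(1/\gamma)} = c_3\sqrt{\gamma\ln(1/\gamma)},$$

where $c_3 = 3\sqrt{\frac{t}{\ln(1/t)}} \approx 0.293353$.

Summing all three terms together we get that

$$r(\gamma) + \frac{7\sqrt{\gamma}}{2\sqrt{e}\sqrt{\ln(1/\gamma)}} + 3\gamma \le (c_1 + c_2 + c_3)\sqrt{\gamma\ln(1/\gamma)} \le 3\sqrt{\gamma\ln(1/\gamma)}.$$

Substituting this bound back in Equation (61) completes the proof. □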
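The constants in the proof above can be recomputed numerically. The following is only a sketch under stated assumptions: it takes $t = 0.032704$ and $k = 2.490795$ as quoted in the text, and it assumes that the three terms being bounded are $r(\gamma)$, $7\sqrt{\gamma}/(2\sqrt{e}\sqrt{\ln(1/\gamma)})$ and $3\gamma$, with $\beta(\gamma) = 1 + 2/(\ln(1/\gamma)-1)$. The variable names are hypothetical.

```python
import math

t = 0.032704   # threshold on gamma, as quoted in the text
k = 2.490795   # constant k, as quoted in the text


def beta(g: float) -> float:
    """beta(gamma) = 1 + 2 / (ln(1/gamma) - 1) (assumed form, see lead-in)."""
    return 1.0 + 2.0 / (math.log(1.0 / g) - 1.0)


L = math.log(1.0 / t)
b = beta(t)

# Expression under the square root of r(t), after factoring out t * ln(1/t).
inner = (0.5 * math.log(1.0 / (1.0 - t)) / (t * L)
         + b / 4.0
         + b * math.log(b) / (4.0 * L))

c1 = k * math.sqrt(inner)                     # depends on the quoted value of k
c2 = 7.0 / (2.0 * math.sqrt(math.e) * L)      # coefficient of the 7/n term
c3 = 3.0 * math.sqrt(t / L)                   # coefficient of the 3*gamma term
total = c1 + c2 + c3

print(f"c1={c1:.6f} c2={c2:.6f} c3={c3:.6f} sum={total:.6f}")
```

The only property the argument relies on is that the sum of the three constants stays below 3 at $\gamma = t$; the computed $c_1$ is sensitive to the exact value used for $k$.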

The following lemma is well known and can be proved by induction on $|S|$.

Lemma 12. For any set $S$, $|G(S, \gamma)| = m(S, \gamma)$.

The following lemma shows that when “$n$ is small” as a function of $\gamma$, the bound
of Theorem 3 holds, thus proving Theorem 3 for this case. In the proof of this lemma we
will use some arithmetical lemmas which are proved later (Lemma 24 and the lemmas
that follow it).

Lemma 13. For any $0 < \gamma < t$, if $n \le 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$, then:

$$\mathrm{E}[m(Z,\gamma)] \ge \frac{n}{1+\gamma n}\Big(1 - 3\sqrt{\gamma\ln(1/\gamma)}\Big).$$

Proof. First we prove the lemma for $n = 1$ (since we would like to use Lemma 26,
which works only for $n \ge 2$). For $n = 1$, $x_1 = z_1$ is $\gamma$-independent so $\mathrm{E}[m(Z,\gamma)] = 1$
and the right hand side of the inequality in the statement is at most 1.

We now prove the lemma for $n \ge 2$.

For $n \ge 2$, using Lemma 15 we derive that

$$\begin{aligned}
\mathrm{E}[m(Z,\gamma)] &\ge n(1-\gamma)^{n-1} \\
&\ge n(1-\gamma)^{n} \\
&= n\big((1-\gamma)^{1/\gamma}\big)^{\gamma n} \qquad (62)\\
&\ge n\big((1-\gamma)/e\big)^{\gamma n} \qquad (63)\\
&= n(1-\gamma)^{\gamma n}(1/e)^{\gamma n}. \qquad (64)
\end{aligned}$$

The expression in (62) is no smaller than the expression in (63) since
$(1-\gamma)^{1/\gamma - 1} \ge 1/e$ for any $0 < \gamma < 1$, see Lemma 24. We now give a lower bound for
the terms in Equation (64). For $(1-\gamma)^{\gamma n}$ we have

$$\begin{aligned}
(1-\gamma)^{\gamma n} &\ge (1-\gamma)^{2\sqrt{e\gamma\ln(1/\gamma)}} \qquad (65)\\
&\ge (1-\gamma)^{2} \qquad (66)\\
&\ge \frac{1}{1+\gamma n}(1-\gamma). \qquad (67)
\end{aligned}$$

Inequality (65) follows since $n \le 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$. Inequality (66) follows since
the function $x\ln(1/x)$ is smaller than $1/e$ for every $0 < x$, see Lemma 25.
Inequality (67) follows since $(1-\gamma) \ge 1/(1+\gamma n)$ for $n \ge 2$ and $0 < \gamma < 1/2$, see
Lemma 26.

For $(1/e)^{\gamma n}$, using the Taylor expansion of $e^{-x}$ around 0 we get that for any
$x$ there exists $\theta$ s.t. $e^{-x} = 1 - x + e^{-\theta}\frac{x^2}{2} \ge 1 - x$, so

$$(1/e)^{\gamma n} \ge (1 - \gamma n).$$


Plugging these lower bounds back into Equation (IMl) we get 


E[m(Z,7)] > n(l - 7)'>'"(l/e)'^’" 

> n—. - (1 — 7)(1 — jn) 


> 

> 


1 -|- 777 

^ / -I 2 \ 

(^1 — 771 — 7 + 7 n) 

(1 - 777 - 7) 

(1 - 7 ■ 2v'eln(l/7) • I/7 - 7) 
(^1 - 2\/e • \/7ln(l/7) - 7) 


> 


> 


1 -b 777 

77 

1 -b 777 

77 

1 -b 777 

77 

1 -b 777 

77 

1 -b 777 

77 

1 -b 777 

77 

1 -b 777 


1 - 


a /7 In ( 1 / 7 ) ^ 2 Ve-b 



(l-3x// 


In 


( 68 ) 


(69) 

(70) 


Inequality (1551) follows since n < 2-^6 117 ( 1 / 7 )- I/7. Inequality (1551) follows since 
the function ( 1 /-^) monotonically increasing with 7 . Inequality dZOl) follows 
by substituting the value of t. □ 

Lemma 14. $\mathrm{Prob}[I_i = 1] \ge (1-\gamma)^{n-1}$.

Proof. Note that for all $1 \le i, k \le n$, the probability that the $k$th sample of
$Z$ is the $i$th smallest (i.e. the probability that $X_i = y_k$) is exactly $1/n$. Also,
note that for any $1 \le k, j \le n$, $k \ne j$, the probability that $y_j$ falls into the real
interval $[y_k - \gamma, y_k)$ is at most $\gamma$. It therefore follows that

$$\begin{aligned}
\mathrm{Prob}[I_i = 1] &\ge \sum_{k=1}^{n}\mathrm{Prob}[X_i = y_k]\cdot\mathrm{Prob}\big[\text{for all } j \ne k,\ y_j \notin [y_k - \gamma, y_k)\big] \\
&\ge \frac{1}{n}\sum_{k=1}^{n}(1-\gamma)^{n-1} = \frac{1}{n}\cdot n(1-\gamma)^{n-1} = (1-\gamma)^{n-1}.
\end{aligned}$$

□


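Lemma 15. $\mathrm{E}[m(Z,\gamma)] \ge n(1-\gamma)^{n-1}$.

Proof. We have that $\mathrm{E}[m(Z,\gamma)] = \mathrm{E}\big[\sum_{i=1}^{n} I_i\big] = \sum_{i=1}^{n}\mathrm{Prob}[I_i = 1] \ge n(1-\gamma)^{n-1}$,
by Lemma 14. □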
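As a sanity check of Lemma 15, the expectation $\mathrm{E}[m(Z,\gamma)]$ can be estimated by simulation and compared with $n(1-\gamma)^{n-1}$ and with a bound of the form $\frac{n}{1+\gamma n}(1-3\sqrt{\gamma\ln(1/\gamma)})$. This is a rough Monte Carlo sketch, not part of the proof; the parameters $n = 50$, $\gamma = 0.03$, the seed, and the helper names are arbitrary choices made for illustration.

```python
import math
import random


def greedy_size(points, gamma):
    """Number of points kept by the left-to-right greedy scan."""
    count, last = 0, None
    for x in sorted(points):
        if last is None or x - last >= gamma:
            count += 1
            last = x
    return count


def estimate_expected_m(n, gamma, trials=2000, seed=0):
    """Monte Carlo estimate of E[m(Z, gamma)] for n uniform points in [0, 1)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        total += greedy_size([rng.random() for _ in range(n)], gamma)
    return total / trials


n, gamma = 50, 0.03
estimate = estimate_expected_m(n, gamma)
lemma15_bound = n * (1 - gamma) ** (n - 1)
theorem3_bound = n / (1 + gamma * n) * (1 - 3 * math.sqrt(gamma * math.log(1 / gamma)))
print(estimate, lemma15_bound, theorem3_bound)
```

For these parameters the estimate should sit near $n/(1+\gamma n) = 20$, well above both lower bounds.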

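Lemma 16. For any $\gamma < t$, $t = 0.032704$, and any $n > 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$ we
have that

$$\sum_{i=1}^{\lfloor n/4\rfloor} p_i \ge \frac{n}{4}\cdot\frac{1}{1+\gamma n}\Big(1 - r(\gamma) - \frac{4}{n}\Big),$$

where

$$r(\gamma) = k\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\beta(\gamma)}{4}\,\gamma\ln(1/\gamma) + \frac{\beta(\gamma)\ln(\beta(\gamma))}{4}\,\gamma},$$

and $k \approx 2.490795$ is defined precisely in the proof below and $\beta(\gamma) = 1 + \frac{2}{\ln(1/\gamma)-1}$
is defined in Lemma 27.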

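Proof. From Lemma 17 it follows that $\lfloor n/4\rfloor \le \sum_{i=1}^{\lfloor n/4\rfloor} p_i\,\mathrm{E}[C_i \mid I_i = 1]$. Using
Lemma 18 to upper bound $\mathrm{E}[C_i \mid I_i = 1]$ we get that:

$$\Big\lfloor\frac{n}{4}\Big\rfloor \le \sum_{i=1}^{\lfloor n/4\rfloor} p_i\Big(1 + \frac{\gamma n}{1 - g(n,\gamma)} + \frac{1}{n}\Big), \qquad (71)$$

where

$$g(n,\gamma) = \frac{4}{3}\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}.$$

Now we give an upper bound for the factor $1 + \frac{\gamma n}{1 - g(n,\gamma)}$ in Equation (71).
The function $g(n,\gamma)$ is increasing with $\gamma$ for $\gamma \in [0,t)$ and decreasing with $n$
for $n \ge e$ (use Lemma 25 with $x = 1/n$). Therefore for $n > 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$
and for $\gamma < t$ we have that $g(n,\gamma) \le g\big(2\sqrt{e\ln(1/t)\cdot 1/t},\, t\big) = h \approx 0.463713$. Since
for any $x \ne 1$ we have that $\frac{1}{1-x} = 1 + x + \frac{x^2}{1-x}$ and since $g(n,\gamma) \ne 1$ we get that:

$$\begin{aligned}
1 + \frac{\gamma n}{1 - g(n,\gamma)} &= 1 + \gamma n\Big(1 + g(n,\gamma) + \frac{g(n,\gamma)^2}{1 - g(n,\gamma)}\Big) \\
&= (1 + \gamma n) + \gamma n\cdot g(n,\gamma)\Big(1 + \frac{g(n,\gamma)}{1 - g(n,\gamma)}\Big) \\
&\le (1 + \gamma n) + \gamma n\cdot g(n,\gamma)\Big(1 + \frac{h}{1-h}\Big) \\
&= (1 + \gamma n)\Big(1 + \frac{\gamma n}{1+\gamma n}\, g(n,\gamma)\Big(1 + \frac{h}{1-h}\Big)\Big). \qquad (72)
\end{aligned}$$

Next, we give an upper bound on $\big(1 + \frac{h}{1-h}\big)\frac{\gamma n}{1+\gamma n}\, g(n,\gamma)$.

$$\begin{aligned}
\Big(1 + \frac{h}{1-h}\Big)\frac{\gamma n}{1+\gamma n}\, g(n,\gamma) &= \Big(1 + \frac{h}{1-h}\Big)\frac{\gamma n}{1+\gamma n}\cdot\frac{4}{3}\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}} \\
&= k\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma}\Big(\frac{\gamma n}{1+\gamma n}\Big)^{2} + \frac{\ln n}{n}\Big(\frac{\gamma n}{1+\gamma n}\Big)^{2}} \\
&\le k\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}\Big(\frac{\gamma n}{1+\gamma n}\Big)^{2}} \\
&= k\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{n\ln n}{(n + 1/\gamma)^{2}}} \\
&\le k\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\beta(\gamma)}{4}\,\gamma\ln(1/\gamma) + \frac{\beta(\gamma)\ln(\beta(\gamma))}{4}\,\gamma} \qquad (73)\\
&= r(\gamma), \qquad (74)
\end{aligned}$$

where $k = \frac{4}{3}\big(1 + \frac{h}{1-h}\big)$ and $r(\gamma)$ is as defined in the statement of the lemma
above. Inequality (73) follows from Lemma 27.

Substituting the bound from Equation (74) into Equation (72) and the bound
from Equation (72) into Equation (71) we get

$$\frac{n}{4} - \frac{3}{4} \le \Big\lfloor\frac{n}{4}\Big\rfloor \le \frac{1}{4} + (1 + \gamma n)(1 + r(\gamma))\sum_{i=1}^{\lfloor n/4\rfloor} p_i.$$

Isolating $\sum_{i=1}^{\lfloor n/4\rfloor} p_i$ we obtain

$$\begin{aligned}
\sum_{i=1}^{\lfloor n/4\rfloor} p_i &\ge \Big(\frac{n}{4} - 1\Big)\cdot\frac{1}{(1+\gamma n)(1 + r(\gamma))} \\
&\ge \Big(\frac{n}{4} - 1\Big)(1 - r(\gamma))\cdot\frac{1}{1+\gamma n} \qquad (75)\\
&\ge \Big(\frac{n}{4}(1 - r(\gamma)) - 1\Big)\cdot\frac{1}{1+\gamma n} \\
&= \frac{n}{4}\cdot\frac{1}{1+\gamma n}\Big(1 - r(\gamma) - \frac{4}{n}\Big),
\end{aligned}$$

where Inequality (75) follows since $1/(1+x) \ge 1 - x$ for $x > -1$. □

Lemma 17. For all $1 \le k \le n$, $\sum_{i=1}^{k} p_i\,\mathrm{E}[C_i \mid I_i = 1] \ge k$.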


Proof. Recall that $X_j$ is a random variable equal to the $j$th point from the left.
For a given instantiation of the $X_j$ let $Q_k$ be the subset of the $k$ leftmost points
that were chosen by the greedy algorithm. Note that $i \in Q_k$ iff the instantiation
of $I_i$ is one.

The union of the intervals $[X_j, X_j + \gamma)$, $j \in Q_k$, must include the instantiations
of $X_1, \ldots, X_k$. This follows since otherwise there would be some point
that was not selected by the greedy choice and that could be selected, contradicting
the definition of the greedy algorithm. Hence $k \le \sum_{i=1}^{k} C_i\cdot I_i$. Taking
expectations we get that

$$\begin{aligned}
k &\le \sum_{i=1}^{k}\mathrm{E}[C_i\cdot I_i] \\
&= \sum_{i=1}^{k}\Big(\mathrm{Prob}[I_i = 0]\cdot\mathrm{E}[C_i\cdot 0 \mid I_i = 0] + \mathrm{Prob}[I_i = 1]\cdot\mathrm{E}[C_i\cdot 1 \mid I_i = 1]\Big) \\
&= \sum_{i=1}^{k} p_i\,\mathrm{E}[C_i \mid I_i = 1].
\end{aligned}$$

□
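The covering argument in the proof of Lemma 17 can be checked on concrete point sets. The sketch below assumes that $C_i$ counts the points lying in $[X_i, X_i + \gamma)$, including $X_i$ itself, and that the greedy scan keeps a point when it is at distance at least $\gamma$ from the last kept point; `covering_check` is a hypothetical helper name.

```python
import random


def covering_check(points, gamma):
    """Return True iff sum_{i<=k} C_i * I_i >= k holds for every k = 1..n."""
    xs = sorted(points)
    n = len(xs)

    # I[i] = 1 iff the greedy scan picks the (i+1)-th point from the left.
    I, last = [], None
    for x in xs:
        pick = last is None or x - last >= gamma
        I.append(1 if pick else 0)
        if pick:
            last = x

    # C[i] = number of points y with x_i <= y and y - x_i < gamma.
    C = [sum(1 for y in xs if y >= x and y - x < gamma) for x in xs]

    # Compare prefix sums of C_i * I_i against k.
    running = 0
    for k in range(1, n + 1):
        running += C[k - 1] * I[k - 1]
        if running < k:
            return False
    return True


rng = random.Random(7)
results = [covering_check([rng.random() for _ in range(30)], 0.05)
           for _ in range(200)]
print(all(results))
```

Every point among the $k$ leftmost is either picked or lies within $\gamma$ to the right of a picked point, which is why the prefix sums never fall below $k$.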


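Lemma 18. Given any $0 < \gamma < 0.032704$, integer $n > 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$, and
integer $0 < i \le n/4$ we have that

$$\mathrm{E}[C_i \mid I_i = 1] \le 1 + \frac{\gamma n}{1 - \frac{4}{3}\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}} + \frac{1}{n}.$$

Proof. Define

$$a_i = \frac{i-1}{n} + \sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}. \qquad (76)$$

We have that

$$\begin{aligned}
\mathrm{E}[C_i \mid I_i = 1] = {} & \mathrm{Prob}[X_i \le a_i \mid I_i = 1]\cdot\mathrm{E}[C_i \mid (X_i \le a_i) \wedge I_i = 1] \\
& + \mathrm{Prob}[X_i > a_i \mid I_i = 1]\cdot\mathrm{E}[C_i \mid (X_i > a_i) \wedge I_i = 1]. \qquad (77)
\end{aligned}$$

For the second term, from Lemma 19 we derive that $\mathrm{Prob}[X_i > a_i \mid I_i = 1] \le \frac{1}{n^2}$.
In addition, since the number of points to the right of (or at) the $i$th point
from the left is $n - i + 1$, we have that

$$\mathrm{E}[C_i \mid (X_i > a_i) \wedge I_i = 1] \le n - i + 1 \le n.$$

Hence, for the second term

$$\mathrm{Prob}[X_i > a_i \mid I_i = 1]\cdot\mathrm{E}[C_i \mid (X_i > a_i) \wedge I_i = 1] \le \frac{1}{n}. \qquad (78)$$

For the first term we apply Lemmata 21 (in Equation (79)) and 22 (in
Inequality (80)) to derive

$$\begin{aligned}
\mathrm{E}[C_i \mid (X_i \le a_i) \wedge I_i = 1]
&= \int_0^{a_i}\mathrm{Prob}[X_i = x \mid X_i \le a_i \wedge I_i = 1]\,\mathrm{E}[C_i \mid X_i = x \wedge I_i = 1]\,dx \\
&= \int_0^{a_i}\mathrm{Prob}[X_i = x \mid X_i \le a_i \wedge I_i = 1]\,\mathrm{E}[C_i \mid X_i = x]\,dx \qquad (79)\\
&\le \int_0^{a_i}\mathrm{Prob}[X_i = x \mid X_i \le a_i \wedge I_i = 1]\Big(1 + \frac{\gamma}{1-x}(n-i)\Big)dx \qquad (80)\\
&\le \int_0^{a_i}\mathrm{Prob}[X_i = x \mid X_i \le a_i \wedge I_i = 1]\Big(1 + \frac{\gamma}{1-a_i}(n-i)\Big)dx \\
&\le \Big(1 + \frac{\gamma}{1-a_i}(n-i)\Big)\int_0^{a_i}\mathrm{Prob}[X_i = x \mid X_i \le a_i \wedge I_i = 1]\,dx \\
&= 1 + \frac{\gamma}{1-a_i}(n-i). \qquad (81)
\end{aligned}$$

By substituting (78) and (81) into Equation (77) we get that:

$$\mathrm{E}[C_i \mid I_i = 1] \le 1 + \frac{\gamma(n-i)}{1 - a_i} + \frac{1}{n}. \qquad (82)$$

Substituting the value of $a_i$ from Equation (76) we get

$$\begin{aligned}
\frac{\gamma(n-i)}{1 - a_i} &= \frac{\gamma(n-i)}{1 - \frac{i-1}{n} - \sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}} \\
&\le \frac{\gamma n(n-i)}{(n-i) - n\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}} \\
&= \frac{\gamma n}{1 - \frac{n}{n-i}\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}}. \qquad (83)
\end{aligned}$$

Since $i \le n/4$ we get that:

$$\frac{n}{n-i}\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}} \le \frac{4}{3}\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}. \qquad (84)$$

By the assumption that $\gamma < 0.032704$ and $n > 2\sqrt{e\ln(1/\gamma)\cdot 1/\gamma}$ it follows
that Expression (84) is strictly less than one. Thus, we can substitute Expression
(84) into Equation (83) and derive

$$\frac{\gamma(n-i)}{1 - a_i} \le \frac{\gamma n}{1 - \frac{4}{3}\sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}}. \qquad (85)$$

By substituting the upper bound in Equation (85) into Inequality (82) we
derive the statement of the Lemma. □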


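Lemma 19. For any $1 \le i \le n$,

$$\mathrm{Prob}\Big[X_i > \frac{i-1}{n} + \sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}\ \Big|\ I_i = 1\Big] \le \frac{1}{n^2}.$$

Proof. Define $a_i = \frac{i-1}{n} + \sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}$. From Bayes rule and Lemmata 14
and 20 we get that

$$\begin{aligned}
\mathrm{Prob}[X_i > a_i \mid I_i = 1] &= \frac{\mathrm{Prob}[X_i > a_i \wedge I_i = 1]}{\mathrm{Prob}[I_i = 1]} \\
&\le \frac{\mathrm{Prob}[X_i > a_i]}{\mathrm{Prob}[I_i = 1]} \\
&\le \frac{\frac{1}{n^2}(1-\gamma)^{n}}{(1-\gamma)^{n-1}} \le \frac{1}{n^2}. \qquad (86)
\end{aligned}$$

We use Lemma 20 to bound the numerator in Inequality (86) and Lemma 14 to
bound the denominator in Inequality (86). □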


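Lemma 20. For $1 \le i \le n$,

$$\mathrm{Prob}\Big[X_i > \frac{i-1}{n} + \sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}\Big] \le \frac{1}{n^2}(1-\gamma)^{n}.$$

Proof. The probability that $X_i > p$ for some $p \in [0,1)$ is exactly the probability
that for some $k \le i - 1$ exactly $k$ of the random points lie in $[0,p]$. That is

$$\mathrm{Prob}[X_i > p] = \sum_{k=0}^{i-1}\binom{n}{k}p^{k}(1-p)^{n-k}. \qquad (87)$$

Notice that the right hand side of Equation (87) is exactly the probability that
a Binomial random variable with $n$ trials and success probability $p$ ($\mathrm{Bin}(n,p)$)
is at most $i - 1$. So we can apply Hoeffding's inequality (recalled in the footnote) to get

$$\begin{aligned}
\mathrm{Prob}[X_i > p] &= \mathrm{Prob}[\mathrm{Bin}(n,p) \le i - 1] \\
&= \mathrm{Prob}\Big[\mathrm{Bin}(n,p) \le \Big(p - \Big(p - \frac{i-1}{n}\Big)\Big)n\Big] \\
&\le e^{-2\left(p - \frac{i-1}{n}\right)^{2} n}.
\end{aligned}$$

By choosing $p = \frac{i-1}{n} + \sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}$ we get that

$$\mathrm{Prob}[X_i > p] \le e^{-n\ln\frac{1}{1-\gamma} - 2\ln n} = \frac{(1-\gamma)^{n}}{n^{2}},$$

finishing the proof of the lemma. □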
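The tail bound of Lemma 20 can be compared with the exact binomial probability. This is a small numerical sketch under the assumption that the threshold is $p = \frac{i-1}{n} + \sqrt{\frac{1}{2}\ln\frac{1}{1-\gamma} + \frac{\ln n}{n}}$ and the bound is $(1-\gamma)^n/n^2$; the helper names are hypothetical.

```python
import math


def binom_cdf(m: int, n: int, p: float) -> float:
    """Exact Prob[Bin(n, p) <= m], summed term by term."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m + 1))


def lemma20_sides(n: int, gamma: float, i: int):
    """Return (exact tail Prob[X_i > p], Hoeffding-based bound) for the i-th point."""
    eps = math.sqrt(0.5 * math.log(1 / (1 - gamma)) + math.log(n) / n)
    p = (i - 1) / n + eps
    exact = binom_cdf(i - 1, n, p)          # equals Prob[X_i > p] by Equation (87)
    bound = (1 - gamma) ** n / n**2
    return exact, bound


n, gamma = 100, 0.03
pairs = [lemma20_sides(n, gamma, i) for i in range(1, n // 4 + 1)]
print(max(e / b for e, b in pairs))
```

For these parameters the exact tail is typically far smaller than the bound, which reflects the slack in Hoeffding's inequality.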

Footnote: Recall that Hoeffding's inequality for a Binomial random variable is
$\mathrm{Prob}[\mathrm{Bin}(n,p) \le (p - \epsilon)n] \le e^{-2\epsilon^{2}n}$. We use this with $\epsilon = p - \frac{i-1}{n}$.

Lemma 21. For any $0 < i \le n$, conditioned on the event $X_i = x$, the random
variables $C_i$ and $I_i$ are independent. Formally,

$$\mathrm{Prob}[C_i \le k \mid (X_i = x) \wedge I_i] = \mathrm{Prob}[C_i \le k \mid (X_i = x)].$$

Proof. Given that $X_i = x$, the set $A$ of the $i$ items arriving at times $\le x$, and the
arrival times of the items in $A$ (that is $X_1, \ldots, X_i$), the action of the greedy
algorithm on the $i$th item arriving at $x$ is determined (that is, $I_i$ is determined).
On the other hand, $C_i$ depends only on the arrival times of the points not in $A$
(that is $Z \setminus A$). Since this holds for any set $A$, and for any arrival times of these
points, it also holds without conditioning on $A$. □

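Lemma 22. For any $0 < i \le n$,

$$\mathrm{E}[C_i \mid X_i = x] \le 1 + \frac{\gamma}{1-x}(n-i).$$

Proof. Conditioning on $X_i = x$, there are exactly $n - i$ points each distributed
uniformly in $[x, 1)$. Let $\{Z_1, \ldots, Z_{n-i}\}$ be $n - i$ independent random variables
each distributed uniformly in $[x, 1)$. Since $C_i$ contains the $i$th point and any of
the following points that fall in $[x, x + \gamma)$ we have that

$$\begin{aligned}
\mathrm{E}[C_i \mid X_i = x] &= 1 + \mathrm{E}\big[|\{j : Z_j < x + \gamma\}|\big] \\
&= 1 + \sum_{j=1}^{n-i}\mathrm{Prob}[Z_j < x + \gamma].
\end{aligned}$$

If $x < 1 - \gamma$ then $\mathrm{Prob}[Z_j < x + \gamma] = \frac{\gamma}{1-x}$, otherwise $\mathrm{Prob}[Z_j < x + \gamma] = 1 \le \frac{\gamma}{1-x}$, therefore

$$\mathrm{E}[C_i \mid X_i = x] \le 1 + \frac{\gamma}{1-x}(n-i).$$

□

Lemma 23. For all $i, k \ge 1$ such that $i + k \le n$,

$$\mathrm{E}[m(\{X_1, \ldots, X_k\}, \gamma)] = \mathrm{E}[m(\{X_{i+1}, \ldots, X_{i+k}\}, \gamma)].$$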


Proof. The number of $\gamma$-independent points is invariant under translation and
rotation, i.e., for any real $x$ and for every $1 \le \ell \le n$, and for any set of points
$\{x_1, \ldots, x_\ell\}$,

$$m(\{x_1, \ldots, x_\ell\}, \gamma) = m(\{x - x_1, \ldots, x - x_\ell\}, \gamma). \qquad (88)$$

Assume that $i + k = n$. The vectors $(X_1, \ldots, X_k)$ and $(1 - X_n, \ldots, 1 - X_{n-k+1})$
have the same distribution, and the lemma follows from (88).

Otherwise ($i + k < n$), we condition on $X_{i+k+1} = x$. It now follows that the
vectors $(X_1, \ldots, X_k)$ and $(x - X_{i+k}, \ldots, x - X_{i+1})$ have the same distribution.
Therefore, it follows from (88) that

$$\mathrm{E}[m(\{X_1, \ldots, X_k\}, \gamma) \mid X_{i+k+1} = x] = \mathrm{E}[m(\{X_{i+1}, \ldots, X_{i+k}\}, \gamma) \mid X_{i+k+1} = x].$$

Since the above holds for every $x$, the lemma follows. □
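The invariance in Equation (88) can be checked directly on small examples. The sketch below uses exact rational arithmetic so that reflecting a point set does not perturb distances; it assumes that $m$ is computed by the left-to-right greedy scan (as in Lemma 12), and the function names `m` and `reflect` are hypothetical.

```python
import random
from fractions import Fraction


def m(points, gamma):
    """Size of a maximum gamma-independent subset, computed greedily left to right."""
    count, last = 0, None
    for x in sorted(points):
        if last is None or x - last >= gamma:
            count += 1
            last = x
    return count


def reflect(points, x):
    """Map every point p to x - p (a reflection of the point set)."""
    return [x - p for p in points]


gamma = Fraction(1, 20)
checks = []
for seed in range(20):
    rng = random.Random(seed)
    pts = [Fraction(rng.randrange(1000), 1000) for _ in range(40)]
    checks.append(m(pts, gamma) == m(reflect(pts, Fraction(1)), gamma))
print(all(checks))
```

Because pairwise distances are unchanged by the reflection, the maximum number of points with pairwise distance at least $\gamma$ is unchanged as well.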

We now give several technical lemmata required to conclude the proofs above. 

Lemma 24. For any $0 < a < 1$

$$(1-a)^{1/a} \ge \frac{1}{e}\cdot(1-a).$$

Proof. For any $x$ such that $1 - a \le x \le 1$ we have that $1 \le 1/x \le 1/(1-a)$. So
by integrating from $1 - a$ to $1$ we get that

$$\begin{aligned}
\int_{1-a}^{1}\frac{1}{x}\,dx &\le \int_{1-a}^{1}\frac{1}{1-a}\,dx \\
\ln(1) - \ln(1-a) &\le a/(1-a) \\
\ln(1-a) &\ge -\frac{1}{1/a - 1} \\
e^{\ln(1-a)} &\ge e^{-\frac{1}{1/a-1}} \\
(1-a) &\ge e^{-\frac{1}{1/a-1}} \qquad (89)\\
(1-a)^{1/a-1} &\ge \frac{1}{e} \qquad (90)\\
(1-a)^{1/a} &\ge \frac{1}{e}\cdot(1-a). \qquad (91)
\end{aligned}$$

Inequality (90) follows from (89) by taking the $(1/a - 1)$ power of both sides
and (91) follows from (90) by multiplying both sides by $(1-a)$. □


Lemma 25. For $x \ge 1/e$,

$$\sqrt{e\cdot x\ln(1/x)}$$

is monotonically decreasing, and for any $x > 0$,

$$\sqrt{e\cdot x\ln(1/x)} \le 1.$$

Proof. Let $f(x) = x\ln(1/x)$. Then $f'(x) = -\ln x - 1$ which is positive for any
$0 < x < 1/e$ and negative for any $x > 1/e$. Therefore the maximum of $f$ is
obtained at $x = 1/e$. It follows that for all $x > 0$

$$\sqrt{e\cdot x\ln(1/x)} \le \sqrt{e\cdot\tfrac{1}{e}\ln(e)} = 1.$$

□

Lemma 26. For any $n \ge 2$ and any $0 < \gamma \le 0.5$

$$(1-\gamma) \ge \frac{1}{1+\gamma n}.$$

Proof. It suffices to prove the above inequality for $n = 2$; for $n > 2$ the right
hand side can only decrease whereas the left hand side does not depend on $n$.

Now, $(1-\gamma)(1+2\gamma) = 1 + \gamma - 2\gamma^{2}$, and $\gamma - 2\gamma^{2} \ge 0$ for all $0 < \gamma \le 0.5$ so
the lemma holds. □
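The three elementary inequalities above are easy to probe on a grid. The following sketch checks $(1-a)^{1/a} \ge (1-a)/e$, the bound $x\ln(1/x) \le 1/e$ that underlies Lemma 25, and $(1-\gamma) \ge 1/(1+\gamma n)$; the grid sizes, the floating-point tolerance, and the function names are arbitrary choices for illustration.

```python
import math


def lemma24_holds(a: float) -> bool:
    """(1 - a)^(1/a) >= (1 - a) / e for 0 < a < 1."""
    return (1 - a) ** (1 / a) >= (1 - a) / math.e


def lemma25_holds(x: float) -> bool:
    """x * ln(1/x) <= 1/e for x > 0, with a small floating-point tolerance."""
    return x * math.log(1 / x) <= 1 / math.e + 1e-12


def lemma26_holds(n: int, gamma: float) -> bool:
    """(1 - gamma) >= 1 / (1 + gamma * n) for n >= 2 and 0 < gamma <= 0.5."""
    return (1 - gamma) >= 1 / (1 + gamma * n)


ok24 = all(lemma24_holds(j / 1000) for j in range(1, 1000))
ok25 = all(lemma25_holds(j / 1000) for j in range(1, 3000))
ok26 = all(lemma26_holds(n, j / 100) for n in range(2, 51) for j in range(1, 51))
print(ok24, ok25, ok26)
```

A grid check of this kind does not replace the proofs; it only guards against transcription errors in the stated inequalities.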


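Lemma 27. For any $0 < \gamma < 1/e$ we have that

$$\frac{n\ln n}{(n + 1/\gamma)^{2}} \le \frac{\beta(\gamma)}{4}\cdot\gamma\ln(1/\gamma) + \frac{\beta(\gamma)\ln(\beta(\gamma))}{4}\cdot\gamma,$$

where $\beta(\gamma)$ is defined to be $1 + \frac{2}{\ln(1/\gamma) - 1}$.

Proof. Fix $0 < \gamma < 1/e$, define $h(n) = \frac{n\ln n}{(n+1/\gamma)^{2}}$ and let $n_0 = \arg\max_{n \ge 1} h(n)$.
Note that $\beta(\gamma) > 1$ for $0 < \gamma < 1/e$; we show below that

$$1/\gamma \le n_0 \le \beta(\gamma)/\gamma. \qquad (92)$$

Assume that Equation (92) holds. As both $n\ln n$ and $(n + 1/\gamma)^{2}$ are monotonically
increasing in $n$, it follows that for all $n \ge 1$

$$\begin{aligned}
h(n) \le h(n_0) &= \frac{n_0\ln(n_0)}{(n_0 + 1/\gamma)^{2}} \\
&\le \frac{(\beta(\gamma)/\gamma)\ln(\beta(\gamma)/\gamma)}{(1/\gamma + 1/\gamma)^{2}} \qquad (93)\\
&= \frac{\beta(\gamma)\gamma}{4}\ln(\beta(\gamma)/\gamma) \\
&= \frac{\beta(\gamma)}{4}\cdot\gamma\ln(1/\gamma) + \frac{\beta(\gamma)\ln(\beta(\gamma))}{4}\cdot\gamma, \qquad (94)
\end{aligned}$$

where Inequality (93) follows from the two bounds in Equation (92), and Equation
(94) is the statement of the Lemma.

It remains to prove the inequalities in (92). We show below that for all
$1 \le n < 1/\gamma$: $h'(n) > 0$, and that for all $n > \beta(\gamma)/\gamma$: $h'(n) < 0$; this proves
Equation (92).

The derivative of $h$ (with respect to $n$) is

$$\begin{aligned}
h'(n) &= \frac{(\ln n + 1)(n + 1/\gamma)^{2} - 2(n + 1/\gamma)\,n\ln n}{(n + 1/\gamma)^{4}} \\
&= \frac{(\ln n + 1)(n + 1/\gamma) - 2n\ln n}{(n + 1/\gamma)^{3}} \\
&= \frac{(n + 1/\gamma) + (1/\gamma - n)\ln n}{(n + 1/\gamma)^{3}};
\end{aligned}$$

as the denominator of $h'(n)$ is positive, the numerator determines the sign. Ergo,
it is enough to look at the sign of

$$\frac{1}{\gamma}\cdot\big(\gamma n + 1 + (1 - \gamma n)\ln n\big),$$

and, again, as $1/\gamma > 0$, this is equal to the sign of

$$k(n) = \gamma n + 1 + (1 - \gamma n)\ln n.$$

For $1 \le n < 1/\gamma$, it must be that $k(n) > 0$, and hence $h'(n) > 0$.

As $\beta(\gamma) > 1$ and $n > \beta(\gamma)/\gamma$ we have that $n > 1/\gamma$ and $\gamma n > \beta(\gamma) > 1$.

Now, we have that

$$\begin{aligned}
k(n) &= \gamma n + 1 + (1 - \gamma n)\ln(n) \\
&= (\gamma n - 1)\Big(1 + \frac{2}{\gamma n - 1} - \ln n\Big) \\
&< (\gamma n - 1)\Big(1 + \frac{2}{\beta(\gamma) - 1} - \ln(1/\gamma)\Big) \\
&= (\gamma n - 1)\big(1 + (\ln(1/\gamma) - 1) - \ln(1/\gamma)\big) \\
&= 0,
\end{aligned}$$

thus concluding the proof of (92) and the lemma.

□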
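The bound of Lemma 27 can be probed numerically by scanning over $n$ for a few values of $\gamma$. The sketch assumes the statement $\frac{n\ln n}{(n+1/\gamma)^2} \le \frac{\beta(\gamma)}{4}\gamma\ln(1/\gamma) + \frac{\beta(\gamma)\ln\beta(\gamma)}{4}\gamma$ with $\beta(\gamma) = 1 + \frac{2}{\ln(1/\gamma)-1}$; the sampled values of $\gamma$, the scan range, and the function names are arbitrary.

```python
import math


def beta(gamma: float) -> float:
    """beta(gamma) = 1 + 2 / (ln(1/gamma) - 1), defined for 0 < gamma < 1/e."""
    return 1 + 2 / (math.log(1 / gamma) - 1)


def lhs(n: int, gamma: float) -> float:
    """Left-hand side h(n) = n ln n / (n + 1/gamma)^2."""
    return n * math.log(n) / (n + 1 / gamma) ** 2


def rhs(gamma: float) -> float:
    """Right-hand side of the lemma, independent of n."""
    b = beta(gamma)
    return b / 4 * gamma * math.log(1 / gamma) + b * math.log(b) / 4 * gamma


gammas = [0.001, 0.01, 0.0327, 0.1, 0.3]
worst = {g: max(lhs(n, g) for n in range(1, 20001)) for g in gammas}
holds = all(worst[g] <= rhs(g) for g in gammas)
print(holds)
```

For each sampled $\gamma$ the scan range covers the interval $[1/\gamma, \beta(\gamma)/\gamma]$ in which the maximizer of $h$ is claimed to lie.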


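D Expected size of the Maximum capacity $d$ $\gamma$-Independent Set ($d$ identical machines)

A capacity $d$ $\gamma$-independent set is a set of “feasible rentals”, given that $d$ items
can be rented, and each item is rented for a period of length $\gamma$. Analogously to the
definition of a $\gamma$-independent set, a capacity $d$ $\gamma$-independent set is a set of points $S \subseteq [0,1)$,
such that given any subset of intervals $I$, $|I| > d$, $I \subseteq \{[t, t+\gamma) \mid t \in S\}$, the
intersection of all intervals in $I$ is empty. The unit interval graph is defined by
intervals of length $\gamma$ whose left endpoints are the points of $Z$.

Let $m_d(Z, \gamma)$ denote the size of a maximum capacity $d$ $\gamma$-independent subset
of a set $Z$. In this section we study the expectation of $m_d(Z, \gamma)$ when the points
of $Z$ are chosen uniformly at random in $[0,1)$. Specifically we prove:

Theorem 5. Let $Z$ be a set of $n$ points chosen uniformly at random in $[0,1)$,
and let $\gamma = 1/k$, for some integer $k \ge 2$. Then we have

$$\mathrm{E}_Z[m_d(Z,\gamma)] \ge \min(n, d\cdot 1/\gamma)\cdot\Big(1 - O\Big(\sqrt{\frac{\ln d}{d}}\Big)\Big). \qquad (95)$$

Proof. Let $Y$ be a random subset of $Z$ of size $\min(n, T)$ where
$T = (d - \sqrt{3d\ln d})\cdot 1/\gamma + 1$. Since for any subset $Y \subseteq Z$, $m_d(Y,\gamma) \le m_d(Z,\gamma)$, a
lower bound on $m_d(Y,\gamma)$ is also a lower bound on $m_d(Z,\gamma)$. As $Z$ is a random
set of points and $Y$ is a random subset of $Z$, choosing $Z$ first and then choosing
$Y$ gives the same distribution on $Y$ as does simply choosing $Y$ at random.

Let $Y' = \{x \in Y \mid |[x - \gamma, x) \cap Y| < d\}$ be the subset of $Y$ containing only
points $x$ that do not have $d$ points in an interval of length $\gamma$ ending at $x$. So
$Y'$ is a capacity $d$ $\gamma$-independent set (as the greedy algorithm applied to $Y'$ will
pick all the points). Therefore for any fixed $Z$,

$$\mathrm{E}_{Y \subseteq Z}[m_d(Y',\gamma)] = \mathrm{E}_{Y \subseteq Z}[|Y'|]. \qquad (96)$$

Since $Y' \subseteq Y \subseteq Z$ we have

$$\mathrm{E}_Z[\mathrm{E}_{Y \subseteq Z}[m_d(Y',\gamma)]] \le \mathrm{E}_Z[\mathrm{E}_{Y \subseteq Z}[m_d(Y,\gamma)]] \le \mathrm{E}_Z[m_d(Z,\gamma)]. \qquad (97)$$

From (96) and (97) it follows that it suffices to show a lower bound on $\mathrm{E}_Z[\mathrm{E}_{Y \subseteq Z}[|Y'|]]$:

$$\begin{aligned}
\mathrm{E}_Z[\mathrm{E}_{Y \subseteq Z}[|Y'|]] &= \sum_{y \in Y} 1\cdot\mathrm{Prob}\big[|[y - \gamma, y) \cap Y| < d\big] \\
&\ge |Y|\cdot\mathrm{Prob}[\mathrm{Bin}(|Y| - 1, \gamma) \le d - 1] \\
&= |Y|\cdot\big(1 - \mathrm{Prob}[\mathrm{Bin}(\min(n,T) - 1, \gamma) \ge d]\big) \\
&\ge |Y|\cdot\big(1 - \mathrm{Prob}[\mathrm{Bin}(T - 1, \gamma) \ge d]\big) \\
&= |Y|\cdot\Big(1 - \mathrm{Prob}\Big[\mathrm{Bin}(T - 1, \gamma) \ge \Big(1 + \frac{d - (T-1)\gamma}{(T-1)\gamma}\Big)(T-1)\gamma\Big]\Big). \qquad (98)
\end{aligned}$$

The derivation following the 4th line above follows since $d = \big(1 + \frac{d - (T-1)\gamma}{(T-1)\gamma}\big)(T-1)\gamma$.
Applying the multiplicative form of the Chernoff bound
($\mathrm{Prob}[X \ge (1+\epsilon)\mu] \le e^{-\epsilon^{2}\mu/3}$ when $X$ is a sum of $n$ IID random variables and
$\mu = \mathrm{E}[X]$) on $\mathrm{Bin}(T - 1, \gamma)$ with $\epsilon = \frac{d - (T-1)\gamma}{(T-1)\gamma}$ we obtain

$$\begin{aligned}
\mathrm{E}_Z[\mathrm{E}_{Y \subseteq Z}[|Y'|]] &\ge |Y|\cdot\Big(1 - e^{-\left(\frac{d - (T-1)\gamma}{(T-1)\gamma}\right)^{2}\frac{(T-1)\gamma}{3}}\Big) \\
&= |Y|\cdot\Big(1 - e^{-\frac{(d - (T-1)\gamma)^{2}}{3(T-1)\gamma}}\Big) \\
&= |Y|\cdot\Big(1 - e^{-\frac{d\ln d}{d - \sqrt{3d\ln d}}}\Big) \\
&\ge |Y|\cdot\big(1 - e^{-\ln d}\big) \\
&= |Y|\cdot(1 - 1/d).
\end{aligned}$$

Substituting $|Y| = \min\big(n, (d - \sqrt{3d\ln d})\cdot 1/\gamma + 1\big)$ we get

$$\begin{aligned}
\mathrm{E}_Z[\mathrm{E}_{Y \subseteq Z}[|Y'|]] &\ge \min\big(n, (d - \sqrt{3d\ln d})\cdot 1/\gamma + 1\big)\cdot(1 - 1/d) \\
&\ge \min\big(n, (d - \sqrt{3d\ln d})\cdot 1/\gamma\big)\cdot(1 - 1/d) \\
&\ge \min(n, d\cdot 1/\gamma)\cdot\Big(1 - \sqrt{\frac{3\ln d}{d}}\Big)\cdot(1 - 1/d) \\
&= \min(n, d\cdot 1/\gamma)\cdot\Big(1 - O\Big(\sqrt{\frac{\ln d}{d}}\Big)\Big).
\end{aligned}$$

□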
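The thinning step used in the proof of Theorem 5 can be sketched in code: keep a point $x$ only if fewer than $d$ points of the set fall in $[x - \gamma, x)$, and then confirm that no time instant is covered by more than $d$ of the resulting intervals. This is an illustrative sketch only; `thin_to_capacity` and `max_overlap` are hypothetical names, and the parameters are arbitrary.

```python
import random


def thin_to_capacity(points, gamma, d):
    """Keep x iff fewer than d points of the input lie in [x - gamma, x)."""
    ys = sorted(points)
    return [x for x in ys if sum(1 for y in ys if x - gamma <= y < x) < d]


def max_overlap(points, gamma):
    """Largest number of intervals [t, t + gamma) that share a common instant.

    The maximum is attained at some left endpoint s, where the intervals
    containing s are exactly those with s - gamma < t <= s.
    """
    ps = sorted(points)
    return max((sum(1 for t in ps if s - gamma < t <= s) for s in ps), default=0)


rng = random.Random(3)
gamma, d = 0.1, 5
Y = [rng.random() for _ in range(60)]
Y_prime = thin_to_capacity(Y, gamma, d)
print(len(Y), len(Y_prime), max_overlap(Y_prime, gamma))
```

The overlap check passes because, at the rightmost interval covering any instant, all other covering intervals start within $\gamma$ to its left, and there are fewer than $d$ such starting points.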


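Theorem 6. Let $Z$ be a set of $n = d\cdot 1/\gamma$ points chosen uniformly at random
in $[0,1)$ and let $\gamma = 1/k$ (for some integer $k \ge 2$). Then we have

$$\mathrm{E}_Z[m_d(Z,\gamma)] \le n\cdot\Big(1 - \Theta\Big(\sqrt{1/d}\Big)\Big). \qquad (99)$$

Proof. For any $0 \le i < 1/\gamma$ define $Z_i \subseteq Z$ to be the set $\{z \in Z \mid z \in [i\gamma, (i+1)\gamma)\}$,
all the points arriving at the $i$-th slice of size $\gamma$.

Obviously $\bigcup_{0 \le i < 1/\gamma} Z_i = Z$. Therefore

$$m_d(Z,\gamma) \le \sum_{i=0}^{1/\gamma - 1} m_d(Z_i,\gamma). \qquad (100)$$

To give an upper bound on the size of $m_d(Z_i,\gamma)$ notice that if $|Z_i| \le d$ then $Z_i$ is
a capacity $d$ $\gamma$-independent set, thus $m_d(Z_i,\gamma) = |Z_i|$. Otherwise, any subset $B$
of $Z_i$ is a capacity $d$ $\gamma$-independent set iff $|B| \le d$. Thus $m_d(Z_i,\gamma) = d$. Combining
these observations we get

$$\mathrm{E}_Z[m_d(Z_i,\gamma)] = \mathrm{E}_Z[\min(|Z_i|, d)]. \qquad (101)$$

$|Z_i|$ counts the number of points that fall into the line segment $[i\gamma, (i+1)\gamma)$ when
we throw $n = d/\gamma$ points uniformly at random into the line segment $[0,1)$. Ergo,
$|Z_i|$ is distributed as $\mathrm{Bin}(d\cdot 1/\gamma, \gamma)$, therefore

$$\mathrm{E}_Z[m_d(Z_i,\gamma)] = \mathrm{E}[\min(\mathrm{Bin}(d\cdot 1/\gamma, \gamma), d)].$$

From Lemma 28 (see below) and the equation above we conclude that
$\mathrm{E}_Z[m_d(Z_i,\gamma)] \le d\Big(1 - \frac{\sqrt{2\pi}}{e^{2}}\cdot\frac{\sqrt{1-\gamma}}{\sqrt{d}}\Big)$. Substituting this bound into (101) and then into (100) we
obtain

$$\begin{aligned}
\mathrm{E}_Z[m_d(Z,\gamma)] &\le \sum_{i=0}^{1/\gamma - 1}\mathrm{E}_Z[m_d(Z_i,\gamma)] \le 1/\gamma\cdot d\Big(1 - \frac{\sqrt{2\pi}}{e^{2}}\cdot\frac{\sqrt{1-\gamma}}{\sqrt{d}}\Big) \\
&\le n\Big(1 - \frac{\sqrt{2\pi}}{e^{2}}\cdot\sqrt{\frac{1}{2d}}\Big) \\
&= n\big(1 - \Theta(\sqrt{1/d})\big),
\end{aligned}$$

where the second-to-last step uses $n = d/\gamma$ and $\gamma \le 1/2$, finishing the proof of the theorem.

□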


The following lemma deals with the following experiment: Toss $n$ coins with
probability $p$ of heads, return the number of heads if the number of heads is at most
$\mu = np$ (the expectation), otherwise return $\mu$. What is the expected value of
this experiment relative to $\mu$? The probability that the outcome exceeds $\mu + t\sigma$
decreases exponentially with $t$, where $\sigma$ is the standard deviation ($\sqrt{\mu}$ in our
case). This suggests that the expected difference is $O(1)$ standard deviations,
which is the claim of the next Lemma:

Lemma 28. For any $n \in \mathbb{N}$, $0 < p < 1$, and $\mu = np$, we have that

$$\mu\Big(1 - \frac{e}{2\pi}\cdot\frac{\sqrt{1-p}}{\sqrt{\mu}}\Big) \le \mathrm{E}[\min(\mathrm{Bin}(n,p), \mu)] \le \mu\Big(1 - \frac{\sqrt{2\pi}}{e^{2}}\cdot\frac{\sqrt{1-p}}{\sqrt{\mu}}\Big).$$

Proof.

$$\mathrm{E}[\min(\mathrm{Bin}(n,p), \mu)] = \mu\cdot\mathrm{Prob}[\mathrm{Bin}(n-1,p) \le \mu - 1] + \mu\cdot\mathrm{Prob}[\mathrm{Bin}(n,p) \ge \mu + 1]. \qquad (102)$$

We describe a single experiment with two disjoint events $A$ and $B$, such that
$\mathrm{Prob}[A] = \mathrm{Prob}[\mathrm{Bin}(n-1,p) \le \mu - 1]$ and $\mathrm{Prob}[B] = \mathrm{Prob}[\mathrm{Bin}(n,p) \ge \mu + 1]$.
This will be useful as then the probability of ($A$ or $B$) is simply the sum of these
probabilities. The experiment is to toss $n$ coins, where the probability of heads
is $p$.

— Event $A$ occurs if amongst the first $n - 1$ results there were no more than
$\mu - 1$ heads, $\mathrm{Prob}[A] = \mathrm{Prob}[\mathrm{Bin}(n-1,p) \le \mu - 1]$.

— Event $B$ occurs if in total, over $n$ coin tosses, there were at least $\mu + 1$ heads,
$\mathrm{Prob}[B] = \mathrm{Prob}[\mathrm{Bin}(n,p) \ge \mu + 1]$.

— It is easy to see that if $A$ holds then $B$ cannot occur. Likewise, if $B$ holds
then $A$ cannot occur.

Thus,

$$\begin{aligned}
\mathrm{Prob}[A \vee B] &= \mathrm{Prob}[A] + \mathrm{Prob}[B] \\
&= \mathrm{Prob}[\mathrm{Bin}(n-1,p) \le \mu - 1] + \mathrm{Prob}[\mathrm{Bin}(n,p) \ge \mu + 1]. \qquad (103)
\end{aligned}$$

On the other hand, the complementary event to $A \vee B$, $\overline{A \vee B}$, occurs if during
the first $n - 1$ tosses there were exactly $\mu$ heads and the last coin toss gave tails.
Therefore

$$\begin{aligned}
\mathrm{Prob}[A \vee B] &= 1 - \mathrm{Prob}[\overline{A \vee B}] \\
&= 1 - \mathrm{Prob}[\mathrm{Bin}(1,p) = 0]\cdot\mathrm{Prob}[\mathrm{Bin}(n-1,p) = \mu] \\
&= 1 - (1-p)\cdot\mathrm{Prob}[\mathrm{Bin}(n-1,p) = \mu]. \qquad (104)
\end{aligned}$$

Substituting (104) into (103) and the result into (102) we obtain

$$\mathrm{E}[\min(\mathrm{Bin}(n,p), \mu)] = \mu\big(1 - (1-p)\cdot\mathrm{Prob}[\mathrm{Bin}(n-1,p) = \mu]\big). \qquad (105)$$

To finish the proof we bound $(1-p)\cdot\mathrm{Prob}[\mathrm{Bin}(n-1,p) = \mu]$; recall that $\mu = np$:

$$\begin{aligned}
(1-p)\cdot\mathrm{Prob}[\mathrm{Bin}(n-1,p) = \mu]
&= (1-p)\binom{n-1}{np}p^{np}(1-p)^{n-1-np} \\
&= \frac{(n-1)!}{(np)!\cdot(n-1-np)!}\,p^{np}(1-p)^{n-np} \\
&= \frac{n-np}{n}\cdot\frac{n!}{(np)!\cdot(n-np)!}\,p^{np}(1-p)^{n-np} \\
&\ge \frac{n-np}{n}\cdot\frac{\sqrt{2\pi}\cdot n^{n}\sqrt{n}}{e\cdot(np)^{np}\sqrt{np}\cdot e\cdot(n-np)^{n-np}\sqrt{n-np}}\,p^{np}(1-p)^{n-np} \qquad (106)\\
&= \frac{\sqrt{2\pi}}{e^{2}}\cdot\frac{n-np}{n}\cdot\frac{\sqrt{n}}{\sqrt{np}\cdot\sqrt{n-np}}\cdot\frac{n^{n}\,p^{np}(1-p)^{n-np}}{(np)^{np}(n-np)^{n-np}} \\
&= \frac{\sqrt{2\pi}}{e^{2}}\cdot\frac{\sqrt{1-p}}{\sqrt{\mu}}, \qquad (107)
\end{aligned}$$

where Inequality (106) follows from Stirling's lower bound on $n!$ in the numerator
and Stirling's upper bound on $(np)!$ and $(n-np)!$ in the denominator (the exponential
factors $e^{-n}$, $e^{-np}$ and $e^{-(n-np)}$ of the Stirling bounds cancel).

A similar proof, using Stirling's upper bound on $n!$ and Stirling's lower bound
on $(np)!$ and $(n-np)!$ yields

$$(1-p)\cdot\mathrm{Prob}[\mathrm{Bin}(n-1,p) = \mu] \le \frac{e}{2\pi}\cdot\frac{\sqrt{1-p}}{\sqrt{\mu}}. \qquad (108)$$

Assigning bounds (108) and (107) into equation (105) gives the statement of the
lemma. □
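The identity in Equation (105) and the Stirling-based bounds can be verified exactly for small parameters. The sketch below assumes that $\mu = np$ is an integer and that the two bounds are $\frac{\sqrt{2\pi}}{e^2}\sqrt{(1-p)/\mu}$ and $\frac{e}{2\pi}\sqrt{(1-p)/\mu}$; the function names and the parameters $n = 20$, $p = 0.25$ are chosen only for illustration.

```python
import math


def binom_pmf(k: int, n: int, p: float) -> float:
    """Prob[Bin(n, p) = k]."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def expected_min(n: int, p: float, mu: int) -> float:
    """E[min(Bin(n, p), mu)] computed by direct summation."""
    return sum(min(k, mu) * binom_pmf(k, n, p) for k in range(n + 1))


def closed_form(n: int, p: float, mu: int) -> float:
    """mu * (1 - (1 - p) * Prob[Bin(n - 1, p) = mu]), as in Equation (105)."""
    return mu * (1 - (1 - p) * binom_pmf(mu, n - 1, p))


n, p = 20, 0.25
mu = 5  # mu = n * p, assumed to be an integer
direct = expected_min(n, p, mu)
formula = closed_form(n, p, mu)

gap = (1 - p) * binom_pmf(mu, n - 1, p)
scale = math.sqrt((1 - p) / mu)
lower = math.sqrt(2 * math.pi) / math.e**2 * scale
upper = math.e / (2 * math.pi) * scale
print(direct, formula, lower, gap, upper)
```

For these parameters the relative gap $(1-p)\cdot\mathrm{Prob}[\mathrm{Bin}(n-1,p) = \mu]$ falls between the two Stirling-based constants times $\sqrt{(1-p)/\mu}$.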